· 3 min read

OpenRefine's packaging for macOS and Windows could be improved in many ways, and we are looking for help on this front. We invite proposals from prospective contractors to improve the install experience of OpenRefine on macOS and/or Windows. Following a similar effort on Ubuntu/Debian packaging, this initiative is meant to improve the user experience on other platforms and lower the install barrier for a broader audience. This project is funded by an EOSS Diversity and Inclusion grant from CZI.

We are hoping that those proposals could solve some of the following issues:

All new packaging steps should be integrated in our Continuous Deployment infrastructure (currently running on GitHub Actions), if not in our Maven packaging configuration itself.

We have funding to contract out this work to freelancers. To respond to this opportunity, please send the following to advisory.committee@openrefine.org:

  • a short description of the changes you propose to introduce and how they relate to the issues above
  • your price for this work
  • any pointers to some related work in other projects (or anything that can help us assess your ability to carry out the proposed work)

You are encouraged to discuss your proposed changes (such as the packaging tools you intend to use) with the contributor community (such as on the openrefine-dev mailing list or in GitHub issues).

Proposals will be reviewed by our project director and advisory committee. Feel free to get in touch with us if you have any questions about this call.

OpenRefine is fiscally sponsored by Code for Science and Society (CS&S). CS&S is an equal opportunity employer committed to hiring a diverse workforce at all levels of the organization thereby creating a culture that allows us to better serve our clientele, our employees and our communities. We value and encourage the contributions of our colleagues and strive to create an environment where everyone can reach their full potential and drive outstanding results. All qualified applicants will receive consideration for employment without regard to race, national origin, age, sex, religion, disability, sexual orientation, marital status, veteran status, gender identity or expression, or any other basis protected by local, state, or federal law. This policy applies with regard to all aspects of one's employment, including hiring, transfer, promotion, compensation, eligibility for benefits, and termination.

· 3 min read
Simple graphic showing group discussions with multicolored text balloons

tl;dr: We are considering moving OpenRefine's mailing lists and Gitter chat to a web-based Discourse forum, and we invite your feedback.

This message has also been posted on OpenRefine's user and developer (https://groups.google.com/g/openrefine-dev/) mailing lists on September 20, 2022.

For quite a while, some active OpenRefine community members (Antonin Delpeuch, Sandra Fauconnier, and the OpenRefine advisory committee) have been talking about ways to make OpenRefine’s community more lively, active and diverse.

As one step, we are considering moving the current main community communication channels (the user mailing list, developer mailing list, and the Gitter chat) to a public, web-based forum, using the Discourse software. After the move, we will not delete the mailing lists, but keep them read-only as a public archive. We feel inspired to move forward with this because, in our latest user survey (held in April-May 2022), a web-based forum was the most popular option when we asked you for your preferred means of communication.

Our choice would be a hosted web-based Discourse forum, on a URL like forum.openrefine.org or similar. We would go for Discourse for many reasons: it is well-designed and widely used open source forum software, and (among other things) it allows much more diverse discussion on 'smaller' topics (e.g. translation of OpenRefine's interface, discussion about larger feature requests, threads in languages other than English...). Discourse also offers mailing list / email modes for people who prefer email as their main mode of communication.

We collected many considerations and pros/cons in this document, including the other options we could consider. We invite your comments and feedback there, or on OpenRefine's mailing list. Also, if anyone is interested in helping with the move, please let me know :-)

Tentative timeline for the move (our team is small, so this may be slower, or faster if a few of you volunteer to help!):

  • September 20, 2022 - OpenRefine’s community is informed about this plan; you can respond and comment for two weeks (and longer if we notice it is needed)
  • Week of Oct 3, 2022 (or later) - Decision go/no go based on your feedback
  • If there is general consensus: around the first weeks of November (depending on availability of Antonin, Sandra and volunteers) - start of actual migration to Discourse
  • If there is general consensus: end of November 2022 - OpenRefine’s community uses Discourse for communications and the mailing lists are now read-only!

Looking forward to your comments and feedback!

· 8 min read

Every two years, OpenRefine holds an extensive survey among its users. Our fifth edition was live in April-May 2022. No fewer than 207 people participated, breaking our 2020 record of 178 responses.

This year's survey was a bit more extensive than the previous ones (in 2012, 2014, 2018 and 2020); we now also included questions related to support and communications in the community. For the first time, we also asked you to give the software a general score. On average, survey respondents gave OpenRefine a solid 8 out of 10! This makes the team very happy, and of course (with your help) we hope to improve this score even more over time.

Now, on to more details.

Who are you?

In our 2020 survey, librarians (37.64%) were by far the largest group of respondents. This year, we proactively reached out to many of OpenRefine's typical user communities; this has resulted in a more 'even' distribution of the sectors and communities that our survey respondents come from. Based on new popular answers in previous surveys, we added a few extra options to the question "Which field(s), discipline(s) or community/ies do you most identify with?" and respondents could provide more than one answer. Today, librarians are still the largest group of OpenRefine users (15.1%), followed by cultural sector professionals (11.6%), Linked Open Data / semantic web aficionadas (11.2%), researchers (10.1%) and data scientists (9.5%).

The vast majority of people use OpenRefine professionally, but 14.9% of our users do indicate that they mainly use the software in their free time. Unsurprisingly, many of them indicate that they are active in the Wikimedia or OpenStreetMap communities.

For the first time, we asked in which language(s) you use OpenRefine (both its interface and the datasets you interact with). English is dominant with 64.1%; followed by French (9.7%), German (8.1%), Spanish (6.5%) and various other languages (3.2%).

2022 survey respondents use OpenRefine a bit more often than we saw in previous editions.

And you are also increasingly becoming OpenRefine 'veterans', with a solid 66.7% saying that you have used the software for more than two years.

You are rating your OpenRefine skill level a bit higher than in the past, too.

You are using OpenRefine for roughly the same purposes as two years ago. We added "data imports from other resources" as a new option, and that's indeed frequently done. Analyzing existing datasets has become more popular (50.2% now), and preparing datasets before visualization in other applications is done less often (22.16%) than in 2020. Reconciliation (55.1%) also keeps (slowly) growing as a typical activity inside OpenRefine.

What does your OpenRefine installation look like?

53.4% of respondents usually update OpenRefine to its current stable release; an additional 23.8% use an earlier version of OpenRefine 3.x.

As we expected, most people (85%) work with a local version of OpenRefine. However, nearly 10% use, or even run, a version via cloud hosting. We are aware that our users are interested in this, and are curious to see whether this number increases over time.

Which plug-in(s) or extension(s) do you use in OpenRefine, if any? We have revamped this question a bit compared to previous years, because the OpenRefine extension ecosystem is in constant change. While more than 50% of users are either unaware of the existence of extensions, or don't consciously use any, the Wikidata extension (installed by default in OpenRefine) is used by no less than 26.9% of respondents. Other popular extensions are the RDF extension (8%), VIB-Bits (4%), and GeoJSON export (3.4%).

As mentioned above, reconciliation is slowly becoming more and more popular in OpenRefine. The Wikidata reconciliation service (shipped in OpenRefine by default) is quite dominant, used by 49.1% of survey respondents. VIAF (15.4%), the Getty vocabularies, and in-house reconciliation services (8.6%) follow in popularity. Under "Other", we see a few new responses from our community of users in the biodiversity domain: Bionomia and the GBIF taxonomy.

How do you perceive OpenRefine?

Which features make you choose OpenRefine over other tools? Many people mention that they appreciate OpenRefine's GUI, price (free!), flexibility, power, reconciliation features, and relative ease of use.

Which tool(s) would you use if OpenRefine would not be available to you? Excel is a winner here, and quite a few people also mention Python and R. QuickStatements would be an important alternative for Wikimedia and Wikibase users. One user mentions they would "sob the entire time", which we of course want to prevent.

How do you describe OpenRefine to someone else? Many of the descriptions involve the words "data", "powerful" and "cleaning", and we very much appreciate phrases like "spreadsheets on steroids", "the Ferrari of spreadsheets", "a librarian's dream", "swiss army knife", or simply "magic".

We received many feature requests as answers to the prompt "It would be awesome if OpenRefine...". Quite a few of these major requests are very familiar to us, and also mentioned on our roadmap.

  • Support for large datasets with many columns and/or rows (3 requests). Good news: we are working towards this goal for OpenRefine's major new 4.x release.
  • A better UX (3 requests), which makes it easier for newcomers to use OpenRefine (3 requests).
  • A free online instance of OpenRefine / hosted OpenRefine (3 requests).
  • Multi-user support in OpenRefine (2 requests).
  • Better Python support (3).
  • Allow working with R for syntax based work (2).
  • More 'point and click' functions to replace GREL (2).
  • Some simple data visualizations (2 requests), including the possibility to plot georeferenced data on a map.
  • Easier import from (2 requests) and reconciliation with external datasets via APIs
  • A feature to add new rows (2)
  • Better developed, more explicit and more detailed notifications and warnings (2)

Some feature requests are specific to Wikimedia, Wikibase and Wikidata support:

Several requests relate to reconciliation services:

  • Faster and more powerful reconciliation
  • Reconciliation against a SPARQL query
  • More reconciliation services
  • Fewer abandoned extensions and reconciliation services (cleanup of inactive and deprecated ones)

Some more requests related to usability and ease of use of OpenRefine:

  • Drag and drop for columns
  • Keyboard accessible GUI
  • Dark mode
  • A language that is more accessible than regex
  • Auto-update when new versions become available

Some requests relate to the way in which OpenRefine works with, and stores, files:

  • More transparent way to store files
  • Integration in OpenOffice/LibreOffice
  • Dynamic links with Google Sheets

Requests related to exporting data:

  • Improved workflow handling, including import/export and multi-project history
  • Preserve hierarchical structure of a dataset and export it too
  • Upload data directly into database
  • Better encoding of diverse characters during export

And finally:

  • More clustering algorithms
  • Improved record mode
  • Parse JSON or XML automatically
  • API calls beyond GET
  • Make OpenRefine suited for georeferencing
  • More training (including in underrepresented contexts)

Communication, help and support

37.1% of survey respondents were unaware that OpenRefine has a user mailing list; 25.2% are subscribed to it. You can find, and subscribe to, the mailing list here: https://groups.google.com/g/openrefine

We asked if respondents would like to communicate with other OpenRefine users online and, if so, which channel(s) they would prefer. An online forum, such as GitHub Discussions or Discourse (which the OpenStreetMap community recently started using), is preferred, but our current mailing list is also appreciated. Slack, which is heavily used in many professional contexts, comes third. Good news: we are indeed investigating whether an online forum, in addition to OpenRefine's user mailing list, would be useful and maintainable.

How you want to help OpenRefine

Finally, OpenRefine's 2022 survey included a question exploring in which ways the community would be willing to help the project. Many individuals and several institutions are interested in donating money, and quite a few people have indicated that they would be willing to translate OpenRefine's interface or participate in one of its committees. We thank everyone who expressed interest in this, and will follow up via email where relevant.

If you want to help translate OpenRefine's interface into your language, you can get started right away! We offer translation via the web-based Weblate tool.

Many thanks to everyone who has completed the survey, and we wish you happy refining!

· 2 min read

This year, OpenRefine is very happy to participate in Outreachy again, an internship program in open source and open science initiated and run by the Software Freedom Conservancy. Outreachy provides paid, remote internships to people subject to systemic bias and impacted by underrepresentation in the technical industry where they are living. For OpenRefine, this is our second time welcoming and mentoring interns who actively improve our codebase and, while doing so, gain experience in open source development.

Outreachy logo

Our current participation in Outreachy is part of the project OpenRefine for Everyone, which focuses on making OpenRefine more useful for international, multilingual audiences, by removing cultural biases from the tool. This project has been generously funded by the Chan Zuckerberg Initiative (Diversity & Inclusion funding cycle).

From the end of May until the end of August 2022, we warmly welcome Elroy Kanye (from Bamenda, Cameroon) and Walton Goga (from Nairobi, Kenya) to our team. Elroy works on the implementation of server-side localization in OpenRefine, and is mentored by Antonin Delpeuch. Walton develops a SPARQL importer, with mentorship from Antoine Beaubien. Please join us in welcoming them, and don’t hesitate to thank them for their contributions as you see them appear in our repositories!

· 2 min read

OpenRefine’s advisory committee is growing! We warmly welcome Jan Ainali as a new member. OpenRefine’s advisory committee runs the administrative side of the project on a day-to-day basis.

Portrait photo of Jan Ainali

Photo: Jan Ainali, CC BY-SA 4.0

Jan brings a lot of valuable experience to OpenRefine and to the advisory committee. He is an active user of OpenRefine and a volunteer in the Wikimedia movement (representing one of OpenRefine’s larger user communities). In his day job as codebase steward at the Foundation for Public Code (based in Amsterdam, NL), Jan works closely with open source software projects developed by public organizations around the world. Previously, he was also, among other roles, a policy advisor at the European Parliament and Executive Director of Wikimedia Sweden.

In Jan’s own words:

Since OpenRefine fills such an important role in the ecosystem and is a valuable tool for different communities, I am glad to join the advisory board. I hope that my experience from working with open knowledge, open data and open source, as well as some organizational development, will prove useful for the OpenRefine community.

Welcome, Jan!

The other members of the advisory committee are Antonin Delpeuch and Martin Magdinier. You can read more about OpenRefine’s governance and community in our Governance document.

· 2 min read

OpenRefine’s advisory committee is happy to welcome Sandra Fauconnier as OpenRefine’s project director. This is a part-time position, funded by an Essential Open Source Software for Science grant awarded by the Chan Zuckerberg Initiative.

In this new role, Sandra will work closely with OpenRefine’s advisory and steering committees, and with its communities of users and contributors. She will help to improve our governance and community diversity, build community-driven structures to formalize OpenRefine’s roadmap and keep it up to date, help us find and secure new sources of funding, and support the advisory committee in OpenRefine’s day-to-day operations.

Sandra started using OpenRefine in 2015, and is currently using it in her free time on a nearly daily basis, mainly for batch editing data on Wikidata and Wikimedia Commons. As an art historian and active Wikimedia volunteer, she has lots of affinity with OpenRefine’s users from the cultural sector, libraries, and the Wikimedia movement. Since July 2021 she has been working with the OpenRefine team as project manager/product owner on our grant with the Wikimedia Foundation to develop integration with Wikimedia Commons.

Sandra is keen to get to know very diverse members from OpenRefine’s community. Here you find an open invitation, and various ways to contact her.

· 4 min read

OpenRefine is seeking a new member for its Advisory Committee (unpaid position).

OpenRefine is a powerful tool to clean messy data, popular in a diverse range of communities. It has been serving the needs of journalists, librarians, Wikipedians, and scientists for more than ten years and is taught in many curricula and workshops around the world. OpenRefine received a two-year grant from the Silicon Valley Community Foundation via the Chan Zuckerberg Initiative under their Essential Open Source Software for Science program, specifically focusing on improving Diversity and Inclusion in open source projects. OpenRefine is a fiscally sponsored project of Code for Science & Society Inc, a 501(c)(3) charitable organization in the USA.

Starting in 2019, OpenRefine explored a new sustainability model by leveraging grants and corporate sponsorship. Three years into that process, OpenRefine is now fiscally sponsored by Code for Science & Society Inc and has secured four significant grants from three different organizations. As a result, the project matured with the creation of the Advisory, Steering, and Code of Conduct committees, and we started hiring contractors to advance our roadmap. During that time, we went through tremendous growth: we doubled the number of active contributors, increased the number of languages translated, and continued to see more users relying on OpenRefine.

Today we are looking for (a) new Advisory Committee member(s) to support our new model, strengthen our governance and continue to engage at an organizational level with our partners and community.

As its primary responsibility, OpenRefine's Advisory Committee oversees the project and coordinates with Code for Science & Society for its operations. So far, the advisory committee has taken care of the following executive responsibilities:

  • Manage grant applications;
  • Oversee hiring and management of contractors;
  • Prepare and approve budgets;
  • Work together with OpenRefine's Steering Committee to establish strategic partnerships with other communities and organizations.

In the next two years, we also want to:

  • Improve our governance and community diversity;
  • Renew our Advisory and Steering Committees, focusing on bringing users and institutions into the project's governance;
  • Formalize OpenRefine's roadmap;
  • Ensure the project’s financial sustainability by searching for new sources of funding.

We are looking into hiring a project director to support the advisory committee in those missions (see the corresponding job posting for details).

About the position

We are seeking at least one new Advisory Committee member to help us pursue the goals described above, bringing their perspective on how OpenRefine should evolve.

This position is for you if you:

  • Care deeply about the future of OpenRefine;
  • Have good communication skills in English;
  • Are able to join a monthly call with Code for Science and Society where we coordinate the project's operations;
  • Are ready to invest at least 3 hours per month (between calls and document review).

This is a volunteer (unpaid) position.

How to respond

Please nominate yourself or other members of the community by contacting advisory.committee@openrefine.org. The current advisory committee will start a conversation with the nominees and select one or more new members. New members will be announced on our blog and mailing list.

· 3 min read

OpenRefine is developing features for Structured Data on Wikimedia Commons and welcomes new members to its team!

Earlier in 2021, OpenRefine received a project grant from the Wikimedia Foundation to add functionalities for Structured Data on Wikimedia Commons (SDC). This has been a major ask from the Wikimedia community, and will provide OpenRefine with powerful features for batch editing and uploading files with structured data on Wikimedia Commons, the media repository behind Wikipedia.

Welcome to OpenRefine's new team members!

Development started in September 2021, and we are very happy that our team is now complete! Join us in welcoming OpenRefine's new team members:

  • Eugene Egbe is a software engineer from Cameroon. He is an active member of the African Wikimedia developers' community and has worked on various software applications in the Wikimedia ecosystem, including the ISA Tool and Scribe. For OpenRefine's SDC project, Eugene develops the Wikimedia-specific features, including a Wikimedia Commons reconciliation service.

  • Joey Salazar is a software engineer from Costa Rica, graduated in China, working in Internet Governance spaces in Europe and America. Continually advocating for free speech online and open source technologies, Joey focuses on policies, standards, and protocol implementations, in particular regarding the DNS and related privacy and censorship considerations. In the SDC project, Joey takes care of software development in OpenRefine's own codebase.

  • Sandra Fauconnier is an art historian who has engaged with digital projects in the cultural sector for many years. She is an active Wikimedian and a prolific user of OpenRefine, and has worked for the Wikimedia Foundation as part of the team which developed Structured Data on Wikimedia Commons. Sandra acts as product manager for the SDC project in OpenRefine.

The team is mentored by Antonin Delpeuch.

We are developing these new features in close collaboration with the Wikimedia community. You can read monthly reports of ongoing work on meta.wikimedia.org. Wikimedia-specific tasks are managed on Wikimedia's bug tracking system Phabricator, and OpenRefine-specific development is tracked on GitHub. If you have a Wikimedia account, you can subscribe to receive updates about the project on a talk page of your choice.

Structured Data on Wikimedia Commons?

What is Structured Data on Wikimedia Commons? Quite a few OpenRefine users will be familiar with Wikidata, the multilingual Linked Open Data knowledge base of the Wikimedia movement. Since 2019, media files (digital photographs, videos, sound files, etc.) on Wikimedia Commons can also be described with multilingual linked data from Wikidata.

OpenRefine already includes support for Wikidata and, more recently, arbitrary Wikibases. The new Structured Data on Commons features in OpenRefine will build further upon this.

Here's an early preview of structured data for a list of Wikimedia Commons files as displayed in OpenRefine:

Commons files with SDC and Wikitext in OpenRefine

· 5 min read

OpenRefine is seeking a Part-time Project Director (paid position).

OpenRefine is a powerful tool to clean messy data, popular in a diverse range of communities. It has been serving the needs of journalists, librarians, Wikipedians, and scientists for more than ten years and is taught in many curricula and workshops around the world. OpenRefine received a two-year grant from the Silicon Valley Community Foundation via the Chan Zuckerberg Initiative under their Essential Open Source Software for Science program, specifically focusing on improving Diversity and Inclusion in open source projects. OpenRefine is a fiscally sponsored project of Code for Science & Society Inc, a 501(c)(3) charitable organization in the USA.

The OpenRefine team is seeking a Project Director to help the project grow its operations and community.

About the Job

  • Reports to the advisory committee;
  • Supervises OpenRefine's paid contractors;
  • Schedule: part-time, the position can accommodate other commitments;
  • Duration: 24 months, with possible extension as new funding is secured;
  • Start date: as soon as you can!
  • Fully remote: you will be working with a team spread across multiple continents;
  • Compensation: we have a budget between USD 40,000 and 50,000 per year, depending on experience, commitment, and contract type (see below);
  • Contract type: Code for Science & Society (our fiscal sponsor) administers the contract and compensation. Depending on the director's country of residence and fiscal status, the position can be filled as a contractor or an employee.

Key Responsibilities

Starting in 2019, OpenRefine explored a new sustainability model by leveraging grants and corporate sponsorship. Three years into that process, OpenRefine is now fiscally sponsored by Code for Science & Society Inc and has secured four significant grants from three different organizations. As a result, the project matured with the creation of the Advisory, Steering, and Code of Conduct committees, and we started hiring contractors to advance our roadmap. During that time, we went through tremendous growth: we doubled the number of active contributors, increased the number of languages translated, and continued to see more users relying on OpenRefine.

Today we are hiring a Project Director to support our new model, strengthen our governance and continue to engage at an organizational level with our partners and community. The project director's primary responsibility will be to lead the community by:

  • Improving our governance and community diversity;
  • Formalizing the project roadmap;
  • Ensuring the project's financial sustainability by searching for new sources of funding;
  • Supporting the advisory committee in the day-to-day operations.

Governance, Inclusion, and Diversity

The project director will develop OpenRefine's existing governing bodies over the next 24 months, focusing on bringing users and institutions into the project's governance and increasing diversity along geographic, racial, and ethnic axes in governance. We want to develop a stronger sense of community among contributors, encouraging long-term commitment to the project and attracting new contributors.

Responsibilities include:

  • To update the project's governance, code of conduct, and contributing documents and create a safe and welcoming space for contributors;
  • To develop and document governance processes to support onboarding and leadership development.

Project Roadmap

The Project Director will work with our institutional partners, steering committee, and community to formalize and advertise the project roadmap.

Fundraising

The Project Director will seek new sources of funding to ensure the project's financial sustainability. Responsibilities include:

  • To apply for funding to support the project, including grants and donations;
  • To report to funders;
  • To explore other funding models, including crowdfunding campaigns or pay for services.

Operation

The project director will support the advisory committee and work with Code for Science & Society in the day-to-day operations by:

  • Acting as a representative of the project to partners and at events, and being responsible for the project's communication;
  • Leading the hiring process for contractors/employees and being the person they report to;
  • Coordinating OpenRefine's participation in internship programs.

Qualifications

We are looking for candidates with:

  • Experience leading open source and/or volunteer-run projects;
  • Experience implementing strategic and operational plans;
  • Excellent communication skills;
  • Proven track record of fundraising for nonprofits;
  • Ability to work collaboratively and non-hierarchically;
  • Experience in remote collaboration and communication; comfort working remotely with colleagues in multiple time zones via GitHub, mailing list discussions, online meetings, and other related tools;
  • Ability to work independently and take initiative;
  • Familiarity with OpenRefine as a user;
  • Familiarity with one or more communities where OpenRefine is popular.

Not certain your credentials are a 100% match with the position description? Please apply anyway! We are looking to find the right person for our team, and we will help develop your skills during the project.

How to respond

Please apply here with your resume or CV, and a short letter of interest. We will schedule an interview with short-listed candidates. Applications will be reviewed on a rolling basis, starting November 10th.

OpenRefine is seeking a Junior Developer to add Structured Data on Wikimedia Commons functionality to OpenRefine's codebase (paid contractor position).

OpenRefine is a power tool to clean messy data, popular in a diverse range of communities. It has been serving the needs of journalists, librarians, Wikipedians, and scientists for more than 10 years, and is taught in many curricula and workshops around the world. OpenRefine is actively used on Wikidata, the structured data ‘sister’ of Wikipedia. In addition, thanks to a grant from the Wikimedia Foundation, OpenRefine will, between September 2021 and August 2022, be extended with structured data functionalities for Wikimedia Commons, the media repository of the Wikimedia ecosystem. This code extension will make it possible to batch edit structured data of existing files on Wikimedia Commons, and to batch upload new Wikimedia Commons files with structured data from the start. OpenRefine is a fiscally sponsored project of Code for Science & Society Inc, a 501(c)(3) charitable organization in the US.

The OpenRefine team is seeking a junior developer to help extend OpenRefine’s own codebase with the above-mentioned functionalities.

  • This is a part-time, 8-month contract.
  • The work will take 35 weeks, from November 2021 until the end of June 2022.
  • The workload averages 30 hours per week.
  • Fully remote. We encourage developers from outside of the USA and EU to apply.
  • We have between 36,000 USD and 42,000 USD available to complete this assignment, depending on experience. The payment details will be negotiated with the contractor, who will invoice Code for Science & Society for their work towards the corresponding goals.

Responsibilities

The Junior Developer:

  • Reworks OpenRefine’s Wikibase extension to work with any entity type, including the MediaInfo entity type used on Wikimedia Commons.
  • Adds support for Wikibase federation in OpenRefine’s Wikibase extension, so that Wikidata items can be used in structured data generated for Commons.
  • Develops export and upload functionality for media files through OpenRefine (either from a hard drive or from a URL).
  • Works in close collaboration with their colleague (Wikimedia developer), and regularly coordinates with the product manager and the rest of the OpenRefine development team.

You can read more about this project, the planned tasks and the various roles, in the public grant proposal on meta.wikimedia.org.

Qualifications

Please do not self-censor if you do not meet all of these criteria, as you will develop your skills during the project.

  • Experience developing in Java and JavaScript.
  • Enthusiasm for writing good documentation and tests alongside your code.
  • Ability to work independently in a fully remote project.
  • Experience with open source development workflows on GitHub.
  • Familiarity with Wikibase and OpenRefine as a user.

How to respond

Please send your resume or CV, a sample of your relevant previous work, and a short letter of interest to advisory.committee@openrefine.org. We will schedule an interview with short-listed candidates. Applications will be reviewed on a rolling basis, with an aim to fill the position by July 30.

OpenRefine is fiscally sponsored by Code for Science and Society (CS&S). CS&S is an equal opportunity employer committed to hiring a diverse workforce at all levels of the organization thereby creating a culture that allows us to better serve our clientele, our employees and our communities. We value and encourage the contributions of our colleagues and strive to create an environment where everyone can reach their full potential and drive outstanding results. All qualified applicants will receive consideration for employment without regard to race, national origin, age, sex, religion, disability, sexual orientation, marital status, veteran status, gender identity or expression, or any other basis protected by local, state, or federal law. This policy applies with regard to all aspects of one’s employment, including hiring, transfer, promotion, compensation, eligibility for benefits, and termination.