How to build, test and run
You will need:
- OpenRefine source code
- Java JDK (for example, OpenJDK)
- Apache Maven (OPTIONAL)
- Node.js and npm
- A Unix/Linux shell environment OR the Windows command line
From the top-level directory of the OpenRefine application, you can build, test and run OpenRefine using the ./refine shell script (if you are working in a *nix shell) or the refine.bat script (from the Windows command line). Note that refine.bat supports only a subset of the functionality supported by the refine shell script. The example commands below use the ./refine shell script; use refine.bat instead if you are working from the Windows command line.
Get OpenRefine source code
With Git installed, use the git clone
command to download the project's repo to a directory of your choice.
Set up JDK
You must install JDK and set the JAVA_HOME environment variable (please ensure it points to the JDK, and not the JRE).
Instructions follow for Windows, Mac and Linux.
Windows
- On Windows 10, click the Start Menu button, type env, and look at the search results. Click “Edit the system environment variables”. (If you are using an earlier version of Windows, use the “Search” or “Search programs and files” box in the Start Menu.)
- Click “Environment Variables…” at the bottom of the Advanced window.
- In the Environment Variables window that appears, click “New…” and create a variable with the key JAVA_HOME. You can set the variable for only your user account or set it as a system variable; it will work either way.
- Set the Value to the folder where you installed JDK, in the format D:\Programs\OpenJDK. You can locate this folder with the “Browse Directory…” button.
Mac
First, find where Java is on your computer with this command:
which java
Check the environment variable JAVA_HOME
with:
$JAVA_HOME/bin/java --version
If this shows your Java version, your JAVA_HOME
variable is set up correctly. If it shows an error, you need to adjust it.
To do so, you can use:
export JAVA_HOME="$(/usr/libexec/java_home)"
Or, for Java 13.x:
export JAVA_HOME="$(/usr/libexec/java_home -v 13)"
Linux
With the terminal
Enter the following:
sudo apt install default-jdk
This probably won’t install the latest JDK package available on the Java website, but it is faster and more straightforward. (At the time of writing, it installs OpenJDK 11.0.7.)
Manually
First, extract the JDK package to the new directory /usr/lib/jvm:
sudo mkdir -p /usr/lib/jvm
sudo tar -x -C /usr/lib/jvm -f /tmp/openjdk-14.0.1_linux-x64_bin.tar.gz
Then, navigate to this folder and confirm the final path (in this case, /usr/lib/jvm/jdk-14.0.1). Open a terminal and type
sudo gedit /etc/profile
In the text window that opens, insert the following lines at the end of the profile
file, using the path above:
JAVA_HOME=/usr/lib/jvm/jdk-14.0.1
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
Note: OpenRefine on Linux currently supports JDK versions 8 to 15. Reference: Issue 4106.
Save and close the file. When you are back in the terminal, type
source /etc/profile
Exit the terminal and restart your system. You can then check that JAVA_HOME
is set properly by opening another terminal and typing
echo $JAVA_HOME
It should show the path you set above.
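JAVA_HOME needs to point to a JDK rather than a JRE, and one practical difference is that only a JDK ships the javac compiler in its bin directory. The following is a minimal sketch of a check built on that fact; the function name is_jdk_home is made up for illustration and is not part of OpenRefine.

```shell
# Return 0 if the given directory looks like a JDK home (it contains an
# executable bin/javac), and non-zero if it is empty, a JRE, or something else.
# is_jdk_home is a hypothetical helper name, not an OpenRefine command.
is_jdk_home() {
    [ -n "${1:-}" ] && [ -x "${1:-}/bin/javac" ]
}

if is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME points to a JDK: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or does not contain bin/javac" >&2
fi
```

If the second message appears, JAVA_HOME is either not set in the current shell or points at a JRE-only installation.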
Maven (Optional)
OpenRefine development requires Apache Maven for its build, test, and packaging processes. We encourage using the latest version of Apache Maven for OpenRefine development; otherwise, spurious errors regarding the POM, dependencies, or packages sometimes appear in your IDE.
If Maven is not already locally installed, then OpenRefine's build script will automatically download Maven for you and use it.
If you want to use your own Maven installation instead of the one downloaded by OpenRefine's build script, set the MAVEN_HOME environment variable to its location. You may need to reboot your machine after setting environment variables. If you receive the message “Could not find the main class: com.google.refine.Refine. Program will exit.”, it is likely that JAVA_HOME is not set correctly.
For example, MAVEN_HOME can be set as follows:
MAVEN_HOME=E:\Downloads\apache-maven-3.8.4-bin\apache-maven-3.8.4\
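On Linux or macOS, the same variable can be set from the shell. The sketch below is an illustration only: set_maven_home is a hypothetical helper name, and it assumes nothing more than that a Maven distribution contains its launcher at bin/mvn.

```shell
# Export MAVEN_HOME only if the directory looks like a Maven installation
# (it contains an executable bin/mvn); otherwise leave the environment as-is.
# set_maven_home is a hypothetical helper name, not an OpenRefine command.
set_maven_home() {
    if [ -x "${1:-}/bin/mvn" ]; then
        MAVEN_HOME="$1"
        export MAVEN_HOME
    else
        echo "No bin/mvn found under '${1:-}'" >&2
        return 1
    fi
}
```

It would be called with the path of your own installation, for instance `set_maven_home /opt/apache-maven-3.8.4` (a made-up location).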
NOTE: You can use Maven commands directly, but running some goals in isolation might fail (try adding the compile test-compile
goals in your invocation if that is the case).
Node.js and npm
The OpenRefine webapp requires Node.js and npm to install package dependencies. Download and install Node.js; you should then have both node and npm installed. You can check the versions by typing:
node -v
npm -v
You can update the version of npm to the latest by typing
npm install -g npm@latest
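Before building, it can help to confirm that the required command-line tools are actually on your PATH. The following is a minimal sketch using the standard `command -v` lookup; the function name missing_tools is hypothetical, and the tool list simply mirrors the requirements given at the top of this page (Maven is left out because it is optional).

```shell
# Print each command from the argument list that cannot be found on PATH and
# return non-zero if at least one is missing.
# missing_tools is a hypothetical helper name, not an OpenRefine command.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}

if missing_tools git java node npm; then
    echo "All required tools were found on PATH"
fi
```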
Building
To see what functions are supported by OpenRefine's build system, type
./refine -h
To build the OpenRefine application from source, type:
./refine clean
./refine build
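Because the refine script is meant to be run from the top-level directory of the source tree, a small wrapper can guard against launching it from the wrong place. This is only a sketch: refine_in and clean_build are hypothetical function names, and the checkout directory is passed in as an argument.

```shell
# Run the refine script inside the given checkout directory, passing the
# remaining arguments through; fail early if the script is not there.
# refine_in and clean_build are hypothetical helper names.
refine_in() {
    if [ "$#" -lt 1 ] || [ ! -x "$1/refine" ]; then
        echo "usage: refine_in <checkout-dir> [refine arguments]" >&2
        return 1
    fi
    checkout="$1"
    shift
    (cd "$checkout" && ./refine "$@")
}

# Clean, then build; the build is skipped if the clean step fails.
clean_build() {
    refine_in "${1:-}" clean && refine_in "${1:-}" build
}
```

Calling `clean_build path/to/OpenRefine` then runs the same two commands shown above, in order, from the right directory.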
Testing
Since OpenRefine is composed of two parts, a server and an in-browser UI, the testing system reflects that:
- on the server side, it's powered by TestNG and the unit tests are written in Java;
- on the client side, we use Cypress and the tests are written in JavaScript.
To run all tests, use:
./refine test
This option is not available when using refine.bat.
If you want to run only the server side portion of the tests, use:
./refine server_test
If you are running the UI tests for the first time, you must go through the installation process.
If you want to run only the client side portion of the tests, use:
./refine ui_test chrome
Running
To run OpenRefine from the command line (assuming you have been able to build from the source code successfully), type:
./refine
By default, OpenRefine will use refine.ini for configuration. You can copy it and rename it to refine-dev.ini
, which will be used for configuration instead. refine-dev.ini
won't be tracked by Git, so feel free to put your custom configurations into it.
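The copy step described above can be sketched as a small shell function. The name create_dev_ini is hypothetical; the sketch only assumes that refine.ini sits in the directory you pass in (by default the current one) and deliberately never overwrites an existing refine-dev.ini.

```shell
# Create refine-dev.ini as a copy of refine.ini in the given directory
# (default: current directory) without overwriting an existing refine-dev.ini.
# create_dev_ini is a hypothetical helper name, not an OpenRefine command.
create_dev_ini() {
    dir="${1:-.}"
    if [ ! -f "$dir/refine.ini" ]; then
        echo "refine.ini not found in $dir" >&2
        return 1
    fi
    if [ -f "$dir/refine-dev.ini" ]; then
        echo "refine-dev.ini already exists, leaving it untouched"
        return 0
    fi
    cp "$dir/refine.ini" "$dir/refine-dev.ini"
}
```

Run it once from the top-level directory, then edit refine-dev.ini with your own settings.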
If you wish to run the application manually, without using the refine
script, you can do so via Maven with mvn exec:java
. The entry point of the application is the com.google.refine.Refine
class.
Building Distributions (Kits)
The Refine build system uses Apache Ant to automate the creation of the installation packages for the different operating systems. The packages are currently optimized to run on Mac OS X, which is the only platform capable of creating the packages for all three operating systems that we support.
To build the distributions, type:
./refine dist <version>
where 'version' is the release version.
Building, Testing and Running OpenRefine from Eclipse
OpenRefine's source comes with Maven configuration files, which are recognized by Eclipse if the Eclipse Maven plugin (m2e) is installed.
At the command line, go to a directory not under your Eclipse workspace directory and check out the source:
git clone https://github.com/OpenRefine/OpenRefine.git
In Eclipse, invoke the Import...
command and select Existing Maven Projects
.
Choose the root directory of your clone of the repository. You get to choose which modules of the project will be imported. You can safely leave out the packaging module, which is only used to generate the Linux, Windows and macOS distributions.
To run and debug OpenRefine from Eclipse, you will need to add an execution configuration on the server
sub-project.
Right click on the server
subproject, click Run as...
and Run configurations...
and create a new Maven Build
run configuration. Rename the run configuration OpenRefine
. Enter the root directory of the project as Base directory
and use exec:java
as a Maven goal.
This will add a run configuration that you can then use to run OpenRefine from Eclipse.
Code style in Eclipse
You can apply the supplied Eclipse code style (in IDEs/eclipse/Refine.style.xml
) to make sure Eclipse lints your files according to the existing style.
Pull requests deviating from this style will fail in the CI.
You can manually apply the code style (regardless of your IDE) with the mvn formatter:format
command.
Testing in Eclipse
You can run the server tests directly from Eclipse. To do that, you need to have the TestNG launcher plugin installed, as well as the TestNG M2E plugin (for integration with Maven). If you don't have them, you can get them by installing new software from this update URL: https://testng.org/doc/download.html
Once the TestNG launching plugin is installed in your Eclipse, right click on the source folder "main/tests/server/src", select Run As
-> TestNG Test
. This should open a new tab with the TestNG launcher running the OpenRefine tests.
Test coverage in Eclipse
It is possible to analyze test coverage in Eclipse with the EclEmma Java Code Coverage
plugin. It will add a Coverage as…
menu similar to the Run as…
and Debug as…
menus which will then display the covered and missed lines in the source editor.
Debug with Eclipse
You can set configuration values in Eclipse for debugging, such as values for the Google Data extension. Other types of configuration that can be set include memory, Wikidata login information and more.
Building, Testing and Running OpenRefine from IntelliJ IDEA
At the command line, go to the directory where you want to save the OpenRefine project and execute the following command to clone the repository:
git clone https://github.com/OpenRefine/OpenRefine.git
Then, open IntelliJ IDEA, go to File -> Open and select the location of the cloned repository.
Because the source code contains a pom.xml file, the IDE will prompt you to add it as a Maven project. Allow auto-import so that it can be added as a Maven project.
If no such prompt appears, open the Maven panel on the right side of the IDE and click “Reimport all Maven projects”, which will add all the dependencies and JAR files required for the project.
After this, you will be able to build, test, and run the OpenRefine project from the terminal. However, if you open a file in any of the test folders, the IDE will show import errors because the project isn't yet set up at the module level.
To remove those errors and use IDE features such as Ctrl + click, you need to set up the project at the module level too. Open each module, such as extensions/wikidata or main, as a project in the IDE. Then, right-click on the project folder and open the module settings.
In the module settings, add the source folder and test source folders of that module.
Then, do the same thing for the main OpenRefine project and you are good to go.