How to read files in Java with Bazel
Bazel Build/Test Sandbox
What: The Bazel sandbox is an isolation mechanism that ensures targets only have visibility of
specified files.
Why: The sandbox ensures Bazel targets are compiled and tested with exactly the same files across
different build hosts, enabling hermetic build security, caching reliability, and identical remote
execution output.
When: The sandbox is used whenever you run ‘bazel build’ or ‘bazel test’ against a target.
How: No files are in the Bazel sandbox unless you declare them in your target definitions via your
BUILD files and the files that you load.
Where to Define Files for the Bazel Sandbox
The conventions used by the default Bazel Java Rules let you reference your files in a uniform way.
There are five attributes you should be aware of specifically for file access.
- srcs - Your Java code
- deps - Your library dependencies available at compile time
- runtime_deps - Your library dependencies available only at runtime
- resources - Other files included in the Java jar
- data - Other files available only at runtime in the runfiles folder
You can define macros that include default values for many of these attributes to cover the most common required values.
Each of these takes a list of string paths, relative to the BUILD file the target is defined in.
You can use the glob(["<pattern>"]) function to populate the list argument.
Reading Files in the Bazel Sandbox
The files defined in the last two attributes, resources and data, have different patterns for
accessing them in your code.
Resources
Resources are placed inside the Jar and are not extracted in the sandbox. Therefore you have to
read them with tools that can look inside the archive via the classpath.
The location of resources in the Jar is intuitive for resources following the standard Maven
directory structure, but can be more complex for files in other locations.
For the examples below, consider this directory structure:
- src/test/resource/
  - testResource.txt
  - resourceFolder/
    - testResourceA.txt
    - testResourceB.txt
Each of the *.txt files contains a single string of text, “Hello <filename>”, where <filename> does
not include the .txt file extension.
Common Mistakes
Treating Resources as Normal Files
@Test
public void testBrokenResourceVanillaViaFileInputStream() {
    final URL resource = this.getClass().getResource("/testResource.txt");
    Assert.assertNotNull("Resource will not be null", resource);
    // resource.getFile() points inside the Jar, so it is not a real path on disk
    try (final FileInputStream contents = new FileInputStream(resource.getFile())) {
        Assert.fail("Should have thrown an exception");
    } catch (FileNotFoundException e) {
        // Expected: there is no such file on the filesystem
        Assert.assertNotNull(e);
    } catch (IOException e) {
        Assert.fail("Should have thrown a FileNotFoundException instead of IOException");
    }
}
You have to remember resources are inside the jar file. You can’t access them like a normal file on disk.
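To see why the file-based approach fails, it can help to look at the URL that the class loader hands back. The following is a minimal sketch, not code from the tests above: it builds a throwaway jar in a temp directory (the JarResourceDemo class and its method names are hypothetical) and reports the URL protocol, whether the URL's file part exists on disk, and the contents read through a stream.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class JarResourceDemo {

    /** Writes a throwaway jar containing one UTF-8 text entry. */
    static Path buildJar(String entryName, String text) throws IOException {
        Path jar = Files.createTempFile("resource-demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write(text.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    /** Returns "protocol|existsAsFile|contents" for one resource in the jar. */
    static String inspect(Path jar, String name) throws IOException {
        URL[] classpath = {jar.toUri().toURL()};
        // Parent is null so that only the demo jar is searched
        try (URLClassLoader loader = new URLClassLoader(classpath, null)) {
            URL url = loader.getResource(name);
            // The file part looks like "file:/...jar!/name", which is not a disk path
            boolean onDisk = new File(url.getFile()).exists();
            try (InputStream in = loader.getResourceAsStream(name)) {
                String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                return url.getProtocol() + "|" + onDisk + "|" + text;
            }
        }
    }
}
```

Calling inspect(buildJar("testResource.txt", "Hello testResource"), "testResource.txt") should yield a "jar" protocol, a false file-existence check, and the expected text, which is the same situation the failing test above runs into.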
Incorrect Paths
@Test
public void testBrokenResourceVanillaViaMissingDelimiter() throws Exception {
    try (final InputStream contents = this.getClass().getResourceAsStream("testResource.txt")) {
        // The lookup fails, so the stream is null (not empty)
        Assert.assertNull(contents);
    }
}
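The leading “/” matters: with it, Class.getResourceAsStream treats the name as an absolute path from the root of the classpath; without it, the name is resolved relative to the package of the class, so the lookup above finds nothing.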
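The difference between relative and absolute resource names can be checked without any project files. This sketch uses String.class only because its class file is available on every JVM; the ResourceNameDemo class and its found method are hypothetical helpers, not part of the tests above.

```java
import java.net.URL;

final class ResourceNameDemo {

    /** True when the name resolves to a resource as seen from String.class. */
    static boolean found(String name) {
        URL url = String.class.getResource(name);
        return url != null;
    }

    public static void main(String[] args) {
        // Relative name: resolved against the package java.lang
        System.out.println(found("String.class"));
        // Absolute name from the classpath root: nothing lives there
        System.out.println(found("/String.class"));
        // Absolute name that spells out the package path
        System.out.println(found("/java/lang/String.class"));
    }
}
```

The first and third lookups succeed and the second does not, mirroring how "testResource.txt" and "/testResource.txt" behave differently for a test class that lives in a package.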
Reading a Resource
With java.io
@Test
public void testResourceVanilla() throws Exception {
    try (final InputStream contents = this.getClass().getResourceAsStream("/testResource.txt")) {
        String s = new String(contents.readAllBytes(), StandardCharsets.UTF_8);
        Assert.assertEquals(TEST_RESOURCE_CONTENTS, s);
    }
}
With com.google.common.io.Resources
@Test
public void testResourceGoogle() throws Exception {
    // Guava resolves the name through a class loader, so no leading "/" is used
    URL url = Resources.getResource("testResource.txt");
    String s = Resources.toString(url, Charsets.UTF_8);
    Assert.assertEquals(TEST_RESOURCE_CONTENTS, s);
}
Exploring a Resource Folder
@Test
public void testResourceSearchFolderGoogle() throws Exception {
    URL url = Resources.getResource("resourceFolder");
    File f = new File(url.getFile());
    Assert.assertNotNull("It seems like you can read the folder as a file", f);
    Assert.assertFalse(f.exists()); // But you can't since it is in the Jar
    // You can enumerate the entries directly from the Jar
    JarURLConnection jarConnection = (JarURLConnection) url.openConnection();
    JarFile jarFile = jarConnection.getJarFile();
    Iterator<JarEntry> iter = jarFile.entries().asIterator();
    ArrayList<JarEntry> entries = Lists.newArrayList(iter);
    // You get all entries in the Jar, not just ones under your url
    Assert.assertTrue(entries.size() > 2);
    JarEntry entry = jarFile.getJarEntry("resourceFolder");
    // It can tell if this entry is a directory
    Assert.assertTrue(entry.isDirectory());
    // Directory entry names end with the path delimiter
    Assert.assertTrue(entry.getName().endsWith("resourceFolder/"));
}
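Because the Jar returns every entry, listing the files in one resource folder means filtering by name prefix. The following is a rough sketch of that filtering under the assumption that you already have a path to the Jar; the JarFolderListing class and its methods are hypothetical, and the throwaway jar exists only to make the example self-contained. In a test like the one above, the JarFile obtained from the JarURLConnection could be filtered the same way.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;

final class JarFolderListing {

    /** Writes a throwaway jar with the given entry names and empty contents. */
    static Path buildJar(String... entryNames) throws IOException {
        Path jar = Files.createTempFile("folder-demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            for (String name : entryNames) {
                out.putNextEntry(new JarEntry(name));
                out.closeEntry();
            }
        }
        return jar;
    }

    /** Sorted names of the non-directory entries located under the folder. */
    static List<String> filesUnder(Path jar, String folder) throws IOException {
        // Directory names inside a jar end with "/", so match on that prefix
        String prefix = folder.endsWith("/") ? folder : folder + "/";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .filter(entry -> !entry.isDirectory())
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith(prefix))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

For a jar holding testResource.txt plus resourceFolder/testResourceA.txt and resourceFolder/testResourceB.txt, filesUnder(jar, "resourceFolder") returns only the two files inside the folder.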
Data
Data files are not included in the Jar, but are instead put in a runfiles folder. They are accessible
relative to the root of the repo, not relative to the BUILD file of the target that references
them, i.e. path/to/build/file/src/test/data_folder/data_file, not src/test/data_folder/data_file.
With java.io
@Test
public void testDataVanilla() throws Exception {
    File dir = new File("path/to/build/file/src/test/dataFolder");
    Assert.assertTrue(dir.isDirectory());
    var file = dir.listFiles()[0];
    Assert.assertTrue(file.isFile());
    // Close the stream when done and state the charset explicitly
    try (FileInputStream stream = new FileInputStream(file)) {
        var s = IOUtils.toString(stream, StandardCharsets.UTF_8);
        Assert.assertEquals("Hello dataFile", s);
    }
}
With java.nio and com.google.devtools.build.runfiles.Runfiles
@Test
public void testDataGoogle() throws IOException {
    Runfiles runfiles = Runfiles.create();
    String filepath = "_main/path/to/build/file/src/test/dataFolder/dataFile.txt";
    // rlocation maps the runfiles-relative path to an actual path on this host
    Path path = Paths.get(runfiles.rlocation(filepath));
    String s = Files.readString(path);
    Assert.assertEquals("Hello dataFile", s);
}
Using java.nio should generally result in more performant code. Using
[com.google.devtools.build.runfiles.Runfiles](https://github.com/bazelbuild/bazel/blob/7.2.1/tools/java/runfiles/Runfiles.java#L31-L132)
should result in more stable code across different hosts and future updates.
The hardcoded _main Bazel workspace name should instead be determined programmatically. Updating
your code to do so is left as an exercise for the reader.
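One possible starting point for that exercise is sketched below. It assumes the code runs under ‘bazel test’, which, as far as I know, exposes the workspace name in the TEST_WORKSPACE environment variable; that assumption, the RunfilesPathSketch class, and the rlocationPath helper are all illustrative and not part of the Runfiles library. The environment is passed in as a map so the logic can be exercised outside Bazel.

```java
import java.util.Map;

final class RunfilesPathSketch {

    /** Fallback when TEST_WORKSPACE is absent; "_main" is the value hardcoded above. */
    static final String DEFAULT_WORKSPACE = "_main";

    /** Prefixes a repo-relative path with the workspace name taken from the environment. */
    static String rlocationPath(Map<String, String> env, String repoRelativePath) {
        // Assumption: Bazel sets TEST_WORKSPACE for tests; verify for your setup
        String workspace = env.get("TEST_WORKSPACE");
        if (workspace == null || workspace.isEmpty()) {
            workspace = DEFAULT_WORKSPACE;
        }
        return workspace + "/" + repoRelativePath;
    }
}
```

In a test, the result of rlocationPath(System.getenv(), "path/to/build/file/src/test/dataFolder/dataFile.txt") could be passed to runfiles.rlocation in place of the hardcoded string.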