How to generate fake data in multiple languages with Dummy4j

Reliable sets of automatically generated data make testing and development easier. If your application has to handle non-ASCII characters or support multiple languages, your fake data should reflect those requirements.

Prepare definitions for new languages

By default, Dummy4j looks for definitions under the resources/dummy4j path. Below you can see the YAML definitions for names in Finnish and French added to a project (names in English are available by default):

[Image: multilingual definitions for fake data]

Bear in mind that the .yml files can be named whatever you like and don’t have to be inside their own folder. In fact, you could place all definitions in one file. What is important is the structure of those files.

The structure of the dummy4j/fr/name.yml file mirrors the default definition list:
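
A minimal sketch of what such a file might contain. The name values are illustrative placeholders rather than the project's actual definitions, and last_name is assumed to follow the same naming convention as the male_first_name and female_first_name keys mentioned in the troubleshooting section:

```yaml
# resources/dummy4j/fr/name.yml
fr:                        # locale code as the root key
  name:
    male_first_name: [ "Louis", "Gabriel", "Jules" ]
    female_first_name: [ "Louise", "Camille", "Chloé" ]
    last_name: [ "Martin", "Bernard", "Dubois" ]   # assumed key name
```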

Moreover, we’re not limited to keeping definitions for a new locale only in the dummy4j directory. Just keep in mind that using files from a different location within the resources/ folder requires specifying the paths parameter for the Dummy4jBuilder:

Generate a separate data set for each language

In order to work with multilingual data, we’ll have to create separate dummies:

Thanks to that, we can generate data in the language specified for the particular dummy:

Note that both dummyForFinnish and dummyForFrench were provided the “en” locale as a fallback – if a definition is not found for their main locale, it will be resolved from the next one. This way we can extend the language capabilities of Dummy4j, while still being able to rely on its default definitions for other methods.
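
To make the fallback behaviour concrete, here is a simplified, standard-library-only model of resolving a key across an ordered list of locales. It is a sketch of the idea, not Dummy4j's actual resolver: the class name, the in-memory definitions, and the address.city key are all hypothetical. With Dummy4j itself, the locale order is presumably set on the builder, along the lines of new Dummy4jBuilder().locale("fi", "en").build() (assumed usage).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified model of locale fallback; NOT Dummy4j's actual resolver.
class LocaleFallbackSketch {

    // Hypothetical in-memory definitions: locale -> (key -> values).
    static Map<String, Map<String, List<String>>> sampleDefinitions() {
        Map<String, List<String>> fi = new HashMap<>();
        fi.put("name.male_first_name", Arrays.asList("Mikko", "Juha"));

        Map<String, List<String>> en = new HashMap<>();
        en.put("name.male_first_name", Arrays.asList("James", "John"));
        // Hypothetical key that is defined only for "en".
        en.put("address.city", Arrays.asList("London"));

        Map<String, Map<String, List<String>>> definitions = new HashMap<>();
        definitions.put("fi", fi);
        definitions.put("en", en);
        return definitions;
    }

    // Walks the locales in priority order and returns the first match.
    static Optional<List<String>> resolve(Map<String, Map<String, List<String>>> definitions,
                                          List<String> locales, String key) {
        for (String locale : locales) {
            Map<String, List<String>> forLocale = definitions.get(locale);
            if (forLocale != null && forLocale.containsKey(key)) {
                return Optional.of(forLocale.get(key));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Map<String, List<String>>> definitions = sampleDefinitions();
        List<String> finnishThenEnglish = Arrays.asList("fi", "en");

        // Defined for "fi", so the Finnish values win.
        System.out.println(resolve(definitions, finnishThenEnglish, "name.male_first_name"));
        // Missing for "fi", so the lookup falls through to "en".
        System.out.println(resolve(definitions, finnishThenEnglish, "address.city"));
    }
}
```

The order of the locale list is what matters here: the first locale that defines the key supplies the values, and later locales are consulted only when earlier ones have nothing for that key.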

Troubleshooting

Resolver returns data in the wrong language

Make sure that the structure of your definition files matches the one the resolver expects. For example, in the default definitions, the key used for generating first names expects the actual values to be available under the name.male_first_name and name.female_first_name keys:

Therefore, generating male names in French for the definitions specified below won’t work:

Furthermore, do not forget to include the locale in the definition files:
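
Both pitfalls can be illustrated with a small standard-library model of a dotted-key lookup in parsed YAML. This is a sketch under stated assumptions, not Dummy4j's code: the class, its helper methods, and the mismatched names/male keys are hypothetical stand-ins for the kind of mistake described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified model of a dotted-key lookup in parsed YAML; NOT Dummy4j's actual code.
class DefinitionLookupSketch {

    // Single-entry map, standing in for one level of YAML nesting.
    static Map<String, Object> entry(String key, Object value) {
        Map<String, Object> map = new HashMap<>();
        map.put(key, value);
        return map;
    }

    // fr -> name -> male_first_name: the expected structure.
    static Map<String, Object> validDefinitions() {
        return entry("fr", entry("name", entry("male_first_name", Arrays.asList("Louis", "Jules"))));
    }

    // fr -> names -> male: hypothetical mismatch in key names.
    static Map<String, Object> wrongKeyDefinitions() {
        return entry("fr", entry("names", entry("male", Arrays.asList("Louis", "Jules"))));
    }

    // name -> male_first_name: the locale root key is missing.
    static Map<String, Object> missingLocaleDefinitions() {
        return entry("name", entry("male_first_name", Arrays.asList("Louis", "Jules")));
    }

    // Descends from the locale root through each part of the dotted key.
    static Optional<List<String>> lookup(Map<String, Object> root, String locale, String dottedKey) {
        Object current = root.get(locale);
        for (String part : dottedKey.split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty();
            }
            current = ((Map<?, ?>) current).get(part);
        }
        if (!(current instanceof List)) {
            return Optional.empty();
        }
        List<String> values = new ArrayList<>();
        for (Object value : (List<?>) current) {
            values.add(String.valueOf(value));
        }
        return Optional.of(values);
    }

    public static void main(String[] args) {
        String key = "name.male_first_name";
        System.out.println("valid:          " + lookup(validDefinitions(), "fr", key).isPresent());
        System.out.println("wrong keys:     " + lookup(wrongKeyDefinitions(), "fr", key).isPresent());
        System.out.println("missing locale: " + lookup(missingLocaleDefinitions(), "fr", key).isPresent());
    }
}
```

Only the first structure yields values for the French locale. If a fallback locale such as "en" is configured, an empty French lookup would presumably be resolved from the English definitions instead, which matches the symptom of data appearing in the wrong language.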

“Could not find definitions for locale” exception

If you get the “Could not find definitions for locale : en. Make sure its definitions are included in the provided paths” message, you have specified a locale that is not available in any definition file. In that case, verify that every locale specified for a Dummy4j instance can actually be found in the files under the resources/dummy4j directory or under the paths you specified when creating the dummy.
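
The verification described above amounts to comparing the locales you request with the locale root keys present in your definition files. A minimal sketch of that comparison, using only the standard library: the class and method names are hypothetical, and the set of available locales is hard-coded here, whereas in a real project it would come from the root keys of your YAML files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-flight check; not part of Dummy4j.
class LocaleCheckSketch {

    // Returns the requested locales that no definition file provides.
    static List<String> missingLocales(List<String> requested, Set<String> available) {
        List<String> missing = new ArrayList<>();
        for (String locale : requested) {
            if (!available.contains(locale)) {
                missing.add(locale);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed: definition files exist for "en" and "fr" only.
        Set<String> available = new HashSet<>(Arrays.asList("en", "fr"));
        List<String> requested = Arrays.asList("fi", "en");

        // "fi" is requested but not defined anywhere, so it is reported.
        System.out.println("Missing locales: " + missingLocales(requested, available));
    }
}
```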

Photo by Erika Cristina from Pexels
