
# What is candy.json?

NEW: Try out our candy.json generator for easier onboarding.
## We call datasets candies

At sweetdata.io, we call datasets "candies", and the `candy.json` file *describes the content* of a candy package.

> dataset == candy

As its extension suggests, `candy.json` is a **JSON** file. JSON (JavaScript Object Notation) is a simple text format for storing data structures.

## File structure

The `candy.json` file has the following mandatory root properties:

- **handle**: *[string]* Candy handle (unique across the platform). The handle is used to access the candy through the URL https://sweetdata.io/candy/\_HANDLE\_ and through the CLI.
- **name**: *[string]* Candy name
- **license**: *[string]* License of the candy
- **author**: *[string]* SweetData.io author username (must match the username of the user account that will upload the candy)
- **version**: *[string]* Version number in the format "1.0.0"
- **datasets**: *[array]* Array containing the list of datasets in this candy. Each entry of the array represents a dataset schema. More info in the *Dataset object* section.

## Dataset object

The `datasets` property is an array of dataset objects, each representing one dataset:

- **datasets[].name**: *[string]* Human name of the dataset
- **datasets[].handle**: *[string]* Handle of the dataset. Can only contain lower-case ASCII letters, numbers and dashes. It must start with a letter. This handle allows accessing the dataset through the API. It must be unique among the datasets of this candy.
- **datasets[].file**: *[string]* File name (path) of the CSV, XLS, XLSX or JSON file containing the data entries. An entry represents a line/row in the file (an array element for JSON) and contains the values for each field of the dataset.
- **datasets[].file**: *[array]* Alternatively, an array of file names (paths) of the CSV, XLS, XLSX or JSON files containing the data entries. Useful if you want to shard your dataset across multiple files.
- **datasets[].format**: *[string]* Format of the input file(s) (i.e. format of *datasets[].file*). Must be one of: csv, xls, xlsx or json
- **datasets[].ignoreFirstLine**: *[boolean]* (optional, only for CSV, XLS and XLSX files, default=false) Whether or not to ignore the first line (header) of the dataset file.
- **datasets[].fields**: *[array]* Array containing each field of the dataset. The fields must be listed in the same order as the columns of the CSV, XLS or XLSX file. More info in the *Field object* section.

## Field object

- **datasets[].fields[].key**: *[string]* Handle-like field name (camel-case). Must start with a lower-case letter. Can only contain ASCII letters (upper and lower-case) and numbers. For a JSON data file, this key must match the JSON object key referring to that column.
- **datasets[].fields[].name**: *[string]* Human name of the column
- **datasets[].fields[].info**: *[string]* (optional) Information/description about the field
- **datasets[].fields[].index**: *[boolean]* (optional) Whether or not to index this column (for faster read performance). Can only be set to true if the field type is (string|number|date)
- **datasets[].fields[].type**: *[string]* Field value type (see *Type property*)
- **datasets[].fields[].fields**: *[array]* (only when type=="object") Subfields of this property.
- **datasets[].fields[].split**: *[string]* (only when type=="array.string") Delimiter used to split the input value if it is a string rather than an array of strings (useful for CSV values).

## Type property

Below are the types currently supported by the SweetData.io platform. These types describe the format of each value of a dataset column.

- **string**: String value
- **boolean**: Boolean value (true or false)
- **number.float**: Floating point number (decimal number)
- **number.integer**: Integer number
- **date**: Date in a format accepted by JavaScript's `Date` constructor
- **date.timestamp**: Unix timestamp in seconds
- **date.timestampMs**: Unix timestamp in milliseconds
- **date.year**: Year in the format YYYY
- **object**: Object field
- **array.\***: Array of type "*", where "*" must be one of the other valid types. E.g. array.string is an array of strings.
- **file.image**: A file path to an image in the package (e.g. "files/faces/bob.png"). Recommended formats: PNG, BMP, GIF, JPEG, TIFF, WEBP.
- **file.audio**: A file path to an audio file in the package (e.g. "files/hello.wav"). Recommended formats: WAV, MP3, 3GP, OGG, M4A.
- **file.video**: A file path to a video in the package (e.g. "files/gestures/up.avi"). Recommended formats: AVI, MP4, MOV, MPEG, WEBM.
- **file.blob**: A file path to an arbitrary file.

## Example

For many (including machines), the best way to learn is through examples. You can find numerous examples of `candy.json` files and candy packages throughout the SweetData.io platform. You can download free datasets and see how the `candy.json` was set up for them.

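
To make a few of the type names from the *Type property* section concrete, here is a minimal Python sketch of how a raw value (for instance a CSV cell) could be converted according to a field object. This is not the platform's actual parser: the `coerce` function is a hypothetical helper, only a subset of the types is covered, and details such as returning UTC `datetime` objects are assumptions made for illustration.

```python
from datetime import datetime, timezone


def coerce(value, field):
    """Convert one raw value according to field["type"] (illustrative subset only)."""
    ftype = field["type"]
    if ftype == "string":
        return str(value)
    if ftype == "number.integer":
        return int(value)
    if ftype == "number.float":
        return float(value)
    if ftype == "date.timestamp":
        # Unix timestamp in seconds
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    if ftype == "date.timestampMs":
        # Unix timestamp in milliseconds
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    if ftype.startswith("array."):
        # A CSV cell arrives as one string, so cut it with the "split" delimiter
        if isinstance(value, str) and "split" in field:
            value = value.split(field["split"])
        element = {"type": ftype[len("array."):]}
        return [coerce(item, element) for item in value]
    raise ValueError(f"type not covered by this sketch: {ftype}")
```

For example, `coerce("EUR|USD", {"type": "array.string", "split": "|"})` yields a list of two strings, while the same timestamp expressed in seconds (`date.timestamp`) and in milliseconds (`date.timestampMs`) converts to the same moment in time.
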
Here is a simple example of a `candy.json` file, taken from the [currencies](https://sweetdata.io/candy/currencies) dataset:

```
{
  "handle": "currencies",
  "name": "Currencies List w/ ISO-4217 codes",
  "license": "Public Domain",
  "author": "sweetdata",
  "version": "1.0.1",
  "datasets": [
    {
      "name": "Currencies list",
      "handle": "dataset",
      "file": "data.csv",
      "format": "csv",
      "ignoreFirstLine": true,
      "fields": [
        {
          "key": "entity",
          "name": "Entity",
          "type": "string"
        },
        {
          "key": "currency",
          "name": "Currency",
          "type": "string"
        },
        {
          "key": "alphabeticCode",
          "name": "Alphabetic Code",
          "type": "string",
          "index": true
        },
        {
          "key": "numericCode",
          "name": "Numeric Code",
          "type": "number.integer",
          "index": true
        },
        {
          "key": "minorUnit",
          "name": "Minor Unit",
          "type": "number.integer"
        }
      ]
    }
  ]
}
```

In this example, the dataset has only one file: `data.csv`. This file is a CSV file with the columns/fields *entity*, *currency*, *alphabeticCode*, *numericCode* and *minorUnit*. Some of those columns/fields are of the type `string` and others of the type `number.integer`.
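
The rules from the *File structure*, *Dataset object* and *Field object* sections can be checked locally before uploading. The following Python sketch is one possible way to do that; it is not an official SweetData.io tool. The function name `check_candy` and the error messages are hypothetical, and treating every dataset property other than `ignoreFirstLine` as required is an assumption drawn from the fact that only that one is marked optional.

```python
import re

REQUIRED_ROOT = ("handle", "name", "license", "author", "version", "datasets")
# Assumption: every dataset property not marked "(optional)" is required
REQUIRED_DATASET = ("name", "handle", "file", "format", "fields")
FORMATS = {"csv", "xls", "xlsx", "json"}
# Lower-case ASCII letters, numbers and dashes; must start with a letter
DATASET_HANDLE = re.compile(r"[a-z][a-z0-9-]*")
# Camel-case: starts with a lower-case letter, then ASCII letters and numbers
FIELD_KEY = re.compile(r"[a-z][A-Za-z0-9]*")


def check_candy(candy):
    """Return a list of problems found in a parsed candy.json (empty if none)."""
    problems = [f"missing root property: {p}" for p in REQUIRED_ROOT if p not in candy]
    seen_handles = set()
    for ds in candy.get("datasets", []):
        label = ds.get("handle", "?")
        problems += [f"{label}: missing property: {p}"
                     for p in REQUIRED_DATASET if p not in ds]
        if "handle" in ds:
            if not DATASET_HANDLE.fullmatch(ds["handle"]):
                problems.append(f"{label}: invalid dataset handle")
            if ds["handle"] in seen_handles:
                problems.append(f"{label}: duplicate dataset handle")
            seen_handles.add(ds["handle"])
        if "format" in ds and ds["format"] not in FORMATS:
            problems.append(f"{label}: unsupported format: {ds['format']}")
        for field in ds.get("fields", []):
            key = field.get("key", "")
            if not FIELD_KEY.fullmatch(key):
                problems.append(f"{label}: invalid field key: {key!r}")
    return problems
```

Passing the result of `json.load` on a `candy.json` file to `check_candy` returns an empty list when none of these checks fail. The sketch does not verify field types, the author username, or that the referenced data files exist.
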
