NEW: Try out our `candy.json` generator for easier onboarding.
## We call datasets candies
At SweetData.io, we call datasets "candies". The `candy.json` file *describes the content* of a candy package.
> dataset == candy
## File structure
The `candy.json` file has the following mandatory root properties:
- **handle**: *[string]* Candy handle (unique across the platform). The handle is used to access the candy through the URL https://sweetdata.io/candy/\_HANDLE\_ and through the CLI.
- **name**: *[string]* Candy Name
- **license**: *[string]* License of the candy
- **author**: *[string]* SweetData.io author username (must match the username of the user account that will upload the candy)
- **version**: *[string]* Version number in the format "1.0.0"
- **datasets**: *[array]* Array containing the list of datasets in this candy. Each entry of the array represents a dataset schema. More info in the *Dataset object* section.
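As a rough illustration of the list above, here is a minimal Python sketch that checks the mandatory root properties of a parsed `candy.json`. The function name `check_root` and the exact version pattern are assumptions made for this example; they are not part of the SweetData.io tooling.

```python
import re

# Mandatory root properties and their expected Python types, per the list above.
REQUIRED_ROOT = {
    "handle": str,
    "name": str,
    "license": str,
    "author": str,
    "version": str,
    "datasets": list,
}

# Assumption: "1.0.0" means three dot-separated non-negative integers.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def check_root(candy: dict) -> list:
    """Return a list of problems found in the root of a candy.json dict."""
    problems = []
    for key, expected in REQUIRED_ROOT.items():
        if key not in candy:
            problems.append(f"missing root property: {key}")
        elif not isinstance(candy[key], expected):
            problems.append(f"{key} must be of type {expected.__name__}")
    version = candy.get("version")
    if isinstance(version, str) and not VERSION_RE.fullmatch(version):
        problems.append("version must look like 1.0.0")
    return problems
```

An empty list means the root looks complete. The platform may apply further checks, such as matching `author` against the uploading account, that this sketch does not attempt.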
## Dataset object
The `datasets` property is an array of dataset objects, each describing one dataset:
- **datasets.name**: *[string]* Human name of the dataset
- **datasets.handle**: *[string]* Handle of the dataset. Can only contain ASCII letters (lower-case), numbers and dashes. It must start with a letter. This handle allows accessing the dataset through the API. It must be unique among the datasets of this candy.
- **datasets.file**: *[string]* File name (path) of the CSV, XLS, XLSX or JSON file containing the data entries. An entry represents a line/row in the file (array element for JSON), and contains the values for each field of the dataset.
- **datasets.file**: *[array]* Alternatively, an array of file names (paths) of the CSV, XLS, XLSX or JSON files containing the data entries. Useful if you want to shard your dataset across multiple files.
- **datasets.format**: *[string]* Format of the input file(s) (i.e. format of *datasets.file*). Must be one of: csv, xls, xlsx or json
- **datasets.ignoreFirstLine**: *[boolean]* (optional, only for CSV, XLS and XLSX files, default=false) Whether or not to ignore the first line (header) of the dataset file.
- **datasets.fields**: *[array]* Array containing each field of the dataset. The fields must be in the same order as the columns of the CSV, XLS or XLSX file. More info in the *Field object* section.
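The dataset rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `dataset_files` and `check_dataset` are hypothetical, and the regular expression is one reading of the handle rule (lower-case ASCII letters, numbers and dashes, starting with a letter).

```python
import re

# One reading of the handle rule: starts with a lower-case letter,
# followed by lower-case letters, digits or dashes.
DATASET_HANDLE_RE = re.compile(r"[a-z][a-z0-9-]*")
FORMATS = {"csv", "xls", "xlsx", "json"}


def dataset_files(dataset: dict) -> list:
    """Normalize datasets.file, which may be a string or an array, to a list."""
    file_value = dataset["file"]
    return [file_value] if isinstance(file_value, str) else list(file_value)


def check_dataset(dataset: dict) -> list:
    """Return a list of problems found in a single dataset object."""
    problems = []
    if not DATASET_HANDLE_RE.fullmatch(dataset.get("handle", "")):
        problems.append("invalid handle")
    if dataset.get("format") not in FORMATS:
        problems.append("invalid format")
    # ignoreFirstLine is documented only for CSV, XLS and XLSX files.
    if dataset.get("ignoreFirstLine", False) and dataset.get("format") == "json":
        problems.append("ignoreFirstLine does not apply to json")
    return problems
```

Normalizing `file` to a list up front lets the rest of a loader treat single-file and sharded datasets the same way.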
## Field object
- **datasets.fields.key**: *[string]* Handle-like field name (camel-case). Must start with a lower-case letter. Can only contain ASCII letters (upper and lower-case) and numbers. For a JSON data file, this key must match the JSON object key referring to that column.
- **datasets.fields.name**: *[string]* Human name of the column
- **datasets.fields.info**: *[string]* (optional) Information/description about the field
- **datasets.fields.index**: *[boolean]* (optional) Whether or not to index this column (for faster read performance). Can only be set to true if the field type is (string|number|date)
- **datasets.fields.type**: *[string]* Field value type (see *Type property*)
- **datasets.fields.fields**: *[array]* (only when type=="object") Subfields of this property.
- **datasets.fields.split**: *[string]* (only when type=="array.string") Delimiter used to split the input value if it is a string rather than an array of strings (useful for CSV values).
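To make the `key` and `split` rules above more concrete, here is a small Python sketch. The helper names are hypothetical, and how the platform treats surrounding whitespace or empty items when splitting is not documented here, so this sketch does a plain split only.

```python
import re

# Field keys: start with a lower-case letter, then ASCII letters or digits only.
FIELD_KEY_RE = re.compile(r"[a-z][A-Za-z0-9]*")


def is_valid_field_key(key: str) -> bool:
    """Check a field key against the camel-case rule described above."""
    return FIELD_KEY_RE.fullmatch(key) is not None


def read_string_array(value, field: dict) -> list:
    """Read a value for a field of type array.string.

    A string input is split on the field's `split` delimiter;
    an input that is already an array is returned as a list.
    """
    if isinstance(value, str):
        return value.split(field["split"])
    return list(value)
```

For instance, a CSV cell holding `red|green|blue` with `"split": "|"` would be read as three strings, while a JSON array passes through unchanged.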
## Type property
Below are the types currently supported by the SweetData.io platform. These types describe the format of each value in a dataset column.
- **string**: String value
- **boolean**: Boolean value (true or false)
- **number.float**: Floating point number (decimal number)
- **number.integer**: Integer number
- **date.timestamp**: Unix timestamp in seconds
- **date.timestampMs**: Unix timestamp in milliseconds
- **date.year**: Year in the format YYYY
- **object**: Object field
- **array.\***: Array of type "*". Type must be one of the other valid types. E.g. array.string is an array of strings.
- **file.image**: A file path to an image in the package (e.g. "files/faces/bob.png"). Recommended formats: PNG, BMP, GIF, JPEG, TIFF, WEBP.
- **file.audio**: A file path to an audio file in the package (e.g. "files/hello.wav"). Recommended formats: WAV, MP3, 3GP, OGG, M4A.
- **file.video**: A file path to a video in the package (e.g. "files/gestures/up.avi"). Recommended formats: AVI, MP4, MOV, MPEG, WEBM.
- **file.blob**: A file path to an arbitrary file.
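The type names above lend themselves to a simple membership check. The following Python sketch is one possible interpretation, not the platform's own validator: it assumes that `array.*` cannot wrap another array, and it reads the `index` rule as "any `string`, `number.*` or `date.*` type". Both function names are hypothetical.

```python
# Non-array types listed above.
BASE_TYPES = {
    "string", "boolean",
    "number.float", "number.integer",
    "date.timestamp", "date.timestampMs", "date.year",
    "object",
    "file.image", "file.audio", "file.video", "file.blob",
}


def is_valid_type(type_name: str) -> bool:
    """Accept a base type, or array.<base type> (nested arrays are assumed invalid)."""
    if type_name.startswith("array."):
        return type_name[len("array."):] in BASE_TYPES
    return type_name in BASE_TYPES


def is_indexable(type_name: str) -> bool:
    """Assumed reading of the index rule: string, number.* and date.* types only."""
    family = type_name.split(".")[0]
    return is_valid_type(type_name) and family in {"string", "number", "date"}
```

With this reading, `array.number.integer` is a valid type but not indexable, and `date.year` is both valid and indexable.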
For many (including machines), the best way to learn is through examples. You can find numerous examples of `candy.json` files and candy packages throughout the SweetData.io platform. You can download free datasets and see how the `candy.json` was set up for them.
Here is a simple example of a `candy.json` file of the [currencies](https://sweetdata.io/candy/currencies) dataset:
"name": "Currencies List w/ ISO-4217 codes",
"license": "Public Domain",
"name": "Currencies list",
"name": "Alphabetic Code",
"name": "Numeric Code",
"name": "Minor Unit",
In this example, this dataset has only one file: `data.csv`. This file is a CSV file with the columns/fields *entity*, *currency*, *alphabeticCode*, *numericCode* and *minorUnit*. Some of those columns/fields are of the type `string` and others of the type `number.integer`.
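For orientation, here is a Python sketch that assembles a complete file with the shape described above and prints it as JSON. Several values are placeholders assumed for illustration and are not taken from the published candy: the `author`, the `version`, the dataset `handle`, the human names of the first two fields, and the choice of which fields are `number.integer`.

```python
import json

candy = {
    "handle": "currencies",  # from the candy URL
    "name": "Currencies List w/ ISO-4217 codes",
    "license": "Public Domain",
    "author": "your-username",  # placeholder, must match the uploading account
    "version": "1.0.0",  # placeholder version
    "datasets": [
        {
            "name": "Currencies list",
            "handle": "currencies",  # assumed dataset handle
            "file": "data.csv",
            "format": "csv",
            "fields": [
                # Field order follows the CSV column order.
                {"key": "entity", "name": "Entity", "type": "string"},  # name assumed
                {"key": "currency", "name": "Currency", "type": "string"},  # name assumed
                {"key": "alphabeticCode", "name": "Alphabetic Code", "type": "string"},
                # Assumption: the two numeric columns are the integer ones.
                {"key": "numericCode", "name": "Numeric Code", "type": "number.integer"},
                {"key": "minorUnit", "name": "Minor Unit", "type": "number.integer"},
            ],
        }
    ],
}

candy_json = json.dumps(candy, indent=2)
print(candy_json)
```

Building the structure in code and serializing it with `json.dumps` avoids hand-written syntax errors such as trailing commas.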