
JSONObjectEachRow

Description

In this format, all data is represented as a single JSON object. Each row is a separate field of this object, and each field's value is structured like a row in the JSONEachRow format.

Example Usage

Example:

{
"row_1": {"num": 42, "str": "hello", "arr": [0,1]},
"row_2": {"num": 43, "str": "hello", "arr": [0,1,2]},
"row_3": {"num": 44, "str": "hello", "arr": [0,1,2,3]}
}
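To make the relationship with JSONEachRow concrete, here is a minimal Python sketch that loads the example document and emits each field's value as one compact line. The helper name `object_each_row_to_each_row` is hypothetical and only illustrates the data layout; it is not how ClickHouse processes the format.

```python
import json

# The example document above, as a string.
doc = """
{
"row_1": {"num": 42, "str": "hello", "arr": [0,1]},
"row_2": {"num": 43, "str": "hello", "arr": [0,1,2]},
"row_3": {"num": 44, "str": "hello", "arr": [0,1,2,3]}
}
"""

def object_each_row_to_each_row(text):
    """Drop the object keys and emit one compact JSON object per line."""
    obj = json.loads(text)
    return "\n".join(
        json.dumps(row, separators=(",", ":")) for row in obj.values()
    )

lines = object_each_row_to_each_row(doc)
print(lines)
```

The keys (`row_1`, `row_2`, `row_3`) are discarded here; the setting described next is what lets them be kept as a column.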

To use object names as column values, use the special setting format_json_object_each_row_column_for_object_name. Set it to the name of the column whose values serve as the JSON keys for the rows in the resulting object. Examples:

For output:

Let's say we have the table test with two columns:

┌─object_name─┬─number─┐
│ first_obj │ 1 │
│ second_obj │ 2 │
│ third_obj │ 3 │
└─────────────┴────────┘

Let's output it in the JSONObjectEachRow format and use the format_json_object_each_row_column_for_object_name setting:

select * from test settings format_json_object_each_row_column_for_object_name='object_name'

The output:

{
"first_obj": {"number": 1},
"second_obj": {"number": 2},
"third_obj": {"number": 3}
}
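The output mapping can be sketched in Python as follows. This is an illustration of the observable behavior in the example (the named column becomes the key and no longer appears in the row body), under the assumption that rows are plain dictionaries; `rows_to_object` is a hypothetical helper, not a ClickHouse API.

```python
import json

# Rows of the table `test` from the example above.
rows = [
    {"object_name": "first_obj", "number": 1},
    {"object_name": "second_obj", "number": 2},
    {"object_name": "third_obj", "number": 3},
]

def rows_to_object(rows, name_column):
    """Key each row by the value of name_column and drop that column from the row body."""
    result = {}
    for row in rows:
        body = {k: v for k, v in row.items() if k != name_column}
        result[row[name_column]] = body
    return result

out = rows_to_object(rows, "object_name")
print(json.dumps(out))
```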

For input:

Let's say we stored the output from the previous example in a file named data.json:

select * from file('data.json', JSONObjectEachRow, 'object_name String, number UInt64') settings format_json_object_each_row_column_for_object_name='object_name'

┌─object_name─┬─number─┐
│ first_obj │ 1 │
│ second_obj │ 2 │
│ third_obj │ 3 │
└─────────────┴────────┘
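The input direction is the reverse mapping: each key of the object is placed into the column named by the setting. A minimal Python sketch, assuming the contents of data.json are available as a string (`object_to_rows` is a hypothetical helper used only for illustration):

```python
import json

# Contents of data.json from the example above.
data_json = """
{
"first_obj": {"number": 1},
"second_obj": {"number": 2},
"third_obj": {"number": 3}
}
"""

def object_to_rows(text, name_column):
    """Turn each field of the object into a row, storing the key in name_column."""
    rows = []
    for key, body in json.loads(text).items():
        row = {name_column: key}
        row.update(body)
        rows.append(row)
    return rows

table_rows = object_to_rows(data_json, "object_name")
for row in table_rows:
    print(row)
```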

It also works in schema inference:

desc file('data.json', JSONObjectEachRow) settings format_json_object_each_row_column_for_object_name='object_name'

┌─name────────┬─type────────────┐
│ object_name │ String │
│ number │ Nullable(Int64) │
└─────────────┴─────────────────┘

Inserting Data

INSERT INTO UserActivity FORMAT JSONEachRow {"PageViews":5, "UserID":"4324182021466249494", "Duration":146,"Sign":-1} {"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}

ClickHouse allows:

  • Any order of key-value pairs in the object.
  • Omitting some values.

ClickHouse ignores spaces between elements and commas after the objects. You can pass all the objects in one line. You do not have to separate them with line breaks.
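A client-side parser with the same tolerance can be sketched in Python using the standard library's `json.JSONDecoder.raw_decode`. This only mirrors the rules stated above (whitespace and commas between objects are skipped); it is an assumption-laden illustration, not ClickHouse's parser, and `parse_objects` is a hypothetical name.

```python
import json

def parse_objects(text):
    """Parse back-to-back JSON objects, skipping whitespace and commas between them."""
    decoder = json.JSONDecoder()
    pos, rows = 0, []
    while True:
        # Skip separators that may appear between objects.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        rows.append(obj)
    return rows

payload = (
    '{"PageViews":5, "UserID":"4324182021466249494", "Duration":146,"Sign":-1} '
    '{"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}'
)
rows = parse_objects(payload)
print(len(rows), rows[1]["PageViews"])
```

Key order differs between the two objects in the payload, which is fine because each object is parsed into a dictionary.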

Omitted values processing

ClickHouse substitutes omitted values with the default values for the corresponding data types.

If DEFAULT expr is specified, ClickHouse uses different substitution rules depending on the input_format_defaults_for_omitted_fields setting.

Consider the following table:

CREATE TABLE IF NOT EXISTS example_table
(
x UInt32,
a DEFAULT x * 2
) ENGINE = Memory;

  • If input_format_defaults_for_omitted_fields = 0, then the default value for both x and a is 0 (the default value for the UInt32 data type).
  • If input_format_defaults_for_omitted_fields = 1, then the default value for x is 0, but the default value for a is x * 2.

Note

When inserting data with input_format_defaults_for_omitted_fields = 1, ClickHouse consumes more computational resources than when inserting with input_format_defaults_for_omitted_fields = 0.
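The two substitution rules for example_table can be modeled with a short Python sketch. It hard-codes the table definition from above (x UInt32, a DEFAULT x * 2) and is meant only to illustrate the stated rules; `fill_omitted` is a hypothetical function, not part of ClickHouse.

```python
def fill_omitted(row, defaults_for_omitted_fields):
    """Model the substitution rules for example_table (x UInt32, a DEFAULT x * 2)."""
    filled = dict(row)
    # An omitted x always gets the type default.
    filled.setdefault("x", 0)
    if "a" not in filled:
        if defaults_for_omitted_fields:
            # Setting = 1: evaluate the DEFAULT expression x * 2.
            filled["a"] = filled["x"] * 2
        else:
            # Setting = 0: fall back to the type default.
            filled["a"] = 0
    return filled

print(fill_omitted({"x": 5}, defaults_for_omitted_fields=0))
print(fill_omitted({"x": 5}, defaults_for_omitted_fields=1))
```

A value that is present in the input is never replaced; the setting only affects fields that were omitted.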

Selecting Data

Consider the UserActivity table as an example:

┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
│ 4324182021466249494 │ 5 │ 146 │ -1 │
│ 4324182021466249494 │ 6 │ 185 │ 1 │
└─────────────────────┴───────────┴──────────┴──────┘

The query SELECT * FROM UserActivity FORMAT JSONEachRow returns:

{"UserID":"4324182021466249494","PageViews":5,"Duration":146,"Sign":-1}
{"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}

Unlike the JSON format, there is no substitution of invalid UTF-8 sequences. Values are escaped in the same way as for JSON.
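Since each output line is a standalone JSON object, a client can process the result line by line. The Python sketch below parses the output shown above; note that UserID appears as a JSON string in this particular output, so the conversion to an integer is done explicitly.

```python
import json

# The query output from the example above.
output = """{"UserID":"4324182021466249494","PageViews":5,"Duration":146,"Sign":-1}
{"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}"""

# One JSON document per non-empty line.
rows = [json.loads(line) for line in output.splitlines() if line.strip()]

# UserID is a string in this output, so convert it when an integer is needed.
user_ids = [int(r["UserID"]) for r in rows]
total_duration = sum(r["Duration"] for r in rows)
print(user_ids, total_duration)
```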

Info

Any sequence of bytes can be output in the strings. Use the JSONEachRow format only if you are sure that the data in the table can be formatted as JSON without losing any information.

Usage of Nested Structures

If you have a table with Nested data type columns, you can insert JSON data with the same structure. Enable this feature with the input_format_import_nested_json setting.

For example, consider the following table:

CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory

As you can see in the Nested data type description, ClickHouse treats each component of the nested structure as a separate column (n.s and n.i for our table). You can insert data in the following way:

INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}

To insert data as a hierarchical JSON object, set input_format_import_nested_json=1.

{
"n": {
"s": ["abc", "def"],
"i": [1, 23]
}
}

Without this setting, ClickHouse throws an exception.

SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'

┌─name────────────────────────────┬─value─┐
│ input_format_import_nested_json │ 0 │
└─────────────────────────────────┴───────┘

INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}

Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: n: (at row 1)

SET input_format_import_nested_json=1

INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}

SELECT * FROM json_each_row_nested

┌─n.s───────────┬─n.i────┐
│ ['abc','def'] │ [1,23] │
└───────────────┴────────┘
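The correspondence between the hierarchical object and the dotted column names (n.s, n.i) can be sketched in Python. This flattens one level of nesting, which matches the example above; it is an illustration of the mapping only, and `flatten_nested` is a hypothetical helper rather than anything ClickHouse exposes.

```python
def flatten_nested(row, sep="."):
    """Flatten one level of nested objects: {"n": {"s": ...}} becomes {"n.s": ...}."""
    flat = {}
    for key, value in row.items():
        if isinstance(value, dict):
            # Each component of the nested structure becomes its own column.
            for sub_key, sub_value in value.items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat

hierarchical = {"n": {"s": ["abc", "def"], "i": [1, 23]}}
flat = flatten_nested(hierarchical)
print(flat)
```

The flattened form is the same shape as the first INSERT in this section, which uses the "n.s" and "n.i" keys directly.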

Format Settings