Restructure JSON Objects Using jq
You may encounter a scenario in which you want to produce JSON from a subset of
the data in a JSON input. Further, that data may need to be presented with a
slightly different structure than the original. As you may already know, jq
provides this capability. This post describes a specific scenario where you
need to map from an object whose elements are themselves objects to an object
where the name of each element is the original element name, and the value is
a single value taken from the original element.
To clarify the use case, start with the following input stored in addrs.json.
{
  "addrs": {
    "o0": {
      "addr": "addr0",
      "v": "val0"
    },
    "o1": {
      "addr": "addr1",
      "v": "val1"
    },
    "o2": {
      "addr": "addr2",
      "v": "val2"
    }
  }
}
The goal is to produce a mapping from the name of each element of addrs to the
value of that element’s addr. That is, produce the following.
{
  "o0": "addr0",
  "o1": "addr1",
  "o2": "addr2"
}
The first method shown encapsulates a few intermediate steps using the function
with_entries. To help deepen the reader’s understanding of how this function
works, the intermediate steps are elaborated. To solidify these concepts, some
additional examples are provided.
Converting an Object to List of Objects
jq excels at extracting data from JSON. The goal of producing the JSON
described in the introduction can be achieved, after extracting the value of
addrs, with a single command.
cat addrs.json | jq '.addrs | with_entries({key: .key, value: .value.addr})'
This takes the value of the element .addrs and pipes it to with_entries(f),
where f is a filter applied to each element.
On first encounter this may be an unsatisfying solution. Without an
understanding of what with_entries does, it may be unclear how to formulate
the filter, or how to extend it when a slightly different scenario arises. For
example, where did .key come from? It is not in the original JSON, and neither
is .value. The filter does reference one element from the original JSON, addr,
but why is it reached through .value?
The answers to these questions can be obtained by understanding the expression
that with_entries encapsulates. It is shorthand for three functions connected
into a pipeline, to_entries | map({key: .key, value: .value.addr}) | from_entries.
Let’s break down each expression in the pipeline in turn. As a hint, the filter
passed to with_entries must produce objects with a specific structure.
Produce an Array of Key-Value Pairs
The first expression in the pipeline is simply the function to_entries. If
the input is an object, it produces an array containing one object per element
of the input, each with two elements, one named key and one named value.
cat addrs.json | jq '.addrs | to_entries'
This produces the following array.
[
  {
    "key": "o0",
    "value": {
      "addr": "addr0",
      "v": "val0"
    }
  },
  {
    "key": "o1",
    "value": {
      "addr": "addr1",
      "v": "val1"
    }
  },
  {
    "key": "o2",
    "value": {
      "addr": "addr2",
      "v": "val2"
    }
  }
]
The elements key and value have made an appearance. As mentioned, to_entries
always produces objects with exactly two elements: key and value. How these
are used in the filter is explained in the next section.
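As a quick check of that claim, here is a minimal sketch that runs to_entries on a small object with scalar values. The object literal is a made-up example rather than part of addrs.json, and the -n and -c options (no input, compact output) are used only to keep the snippet self-contained. It assumes jq is available on your PATH.

```shell
# Hypothetical two-element object; not part of the post's addrs.json.
# -n: run the filter without reading any input, -c: print compact JSON.
entries=$(jq -c -n '{"a": 1, "b": "two"} | to_entries')
echo "$entries"
# prints [{"key":"a","value":1},{"key":"b","value":"two"}]
```

Each element becomes one object with a key and a value, whatever the type of the original value.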
Map Elements to Key-Value Pairs
It might seem like we are heading in the wrong direction. However, we now have
an array of objects that can be mapped to simpler objects using map(f). This
function applies the filter f to each item in the array. In other words, inside
f the filter . refers to the current item, so the value of key and the addr
element of value can be referenced with the filters .key and .value.addr
respectively.
cat addrs.json | jq '.addrs | to_entries | map({key: .key, value: .value.addr})'
Notice that the value of f is the same as the value used for with_entries.
This produces the following array of objects.
[
  {
    "key": "o0",
    "value": "addr0"
  },
  {
    "key": "o1",
    "value": "addr1"
  },
  {
    "key": "o2",
    "value": "addr2"
  }
]
We now have exactly the data we need. However, the structure still needs
alteration. The elements key and value already made their appearance in
to_entries, and are still present in this output. The reason the filter keeps
this specific structure will become clear in the next section.
Map Back to an Object
The last step is to map this array of objects back to an object where the
element names and values are obtained from key and value respectively. This
can be accomplished with the from_entries function.
cat addrs.json | jq '.addrs | to_entries | map({key: .key, value: .value.addr}) | from_entries'
This results in the desired JSON, exactly as the version using with_entries
does.
{
  "o0": "addr0",
  "o1": "addr1",
  "o2": "addr2"
}
The function from_entries takes each item of the input array and creates one
element of the output object, essentially reversing the initial mapping, with
map(f) getting the list of objects into the structure required by
from_entries. The function to_entries produces objects with two elements: key
and value. The function from_entries takes objects that have two elements (key
and value) and converts them to elements where the name is the value of key,
and the value is the value of value. For example, the object
{
  "key": "o1",
  "value": "addr1"
}
gets converted to the following.
{
  "o1": "addr1"
}
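The conversion can be tried in isolation. The following is a minimal sketch, assuming jq is on your PATH. It feeds from_entries an array holding just the object above, then shows that to_entries followed by from_entries hands back the object it started with. The two-element object in the second command is a made-up example.

```shell
# A one-item array of key/value objects becomes a one-element object.
single=$(jq -c -n '[{"key": "o1", "value": "addr1"}] | from_entries')
echo "$single"
# prints {"o1":"addr1"}

# Hypothetical input: to_entries | from_entries is a round trip.
roundtrip=$(jq -c -n '{"o1": "addr1", "o2": "addr2"} | to_entries | from_entries')
echo "$roundtrip"
# prints {"o1":"addr1","o2":"addr2"}
```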
The function with_entries simply does the work of all three expressions in a
single expression. The filter passed to it (and likewise to map in the expanded
version) must create objects that conform to the structure required by
from_entries.
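To see the equivalence concretely, here is a sketch that runs the short and the expanded forms against the same input and compares the results. The input is the content of addrs.json inlined into a shell variable so the snippet is self-contained. The third form, with_entries(.value |= .addr), is an alternative spelling not used elsewhere in this post; it updates only the value of each entry and leaves the key alone.

```shell
# Same data as addrs.json, inlined so the example runs on its own.
input='{"addrs":{"o0":{"addr":"addr0","v":"val0"},"o1":{"addr":"addr1","v":"val1"},"o2":{"addr":"addr2","v":"val2"}}}'

# Short form.
short=$(echo "$input" | jq -c '.addrs | with_entries({key: .key, value: .value.addr})')

# Expanded three-stage pipeline.
long=$(echo "$input" | jq -c '.addrs | to_entries | map({key: .key, value: .value.addr}) | from_entries')

# Alternative: replace each entry's value in place, keep its key.
update=$(echo "$input" | jq -c '.addrs | with_entries(.value |= .addr)')

echo "$short"
# prints {"o0":"addr0","o1":"addr1","o2":"addr2"}
[ "$short" = "$long" ] && [ "$short" = "$update" ] && echo "all three forms agree"
```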
Hopefully showing the intermediate steps and explaining the input and output
structures between them helps clarify the use of all these operations. To
further solidify your understanding, let’s proceed with a few more examples
that use jq’s other features.
Add Some Numbers
It is no problem for jq to add numbers in an array. The function add takes
an array of numbers and produces their sum. The following example demonstrates
this.
echo "[123, 456, 789]" | jq 'add'
1368
Seldom will input data be so simple, however. Let’s take the following JSON, stored in numbers.json, as our input. Maybe this is the list of books you have read so far this month and you are curious how many pages you have read in total.
{
  "books": [
    {
      "title": "The Count of Monte Christo",
      "pages": 1200
    },
    {
      "title": "How to Do the Right Thing",
      "pages": 201
    },
    {
      "title": "Consider the Lobster",
      "pages": 437
    }
  ]
}
We know the input to add is an array of numbers. So let’s make our first goal
to get the list of books. This can be obtained using the filter .books.
cat numbers.json | jq '.books'
This produces the following.
[
  {
    "title": "The Count of Monte Christo",
    "pages": 1200
  },
  {
    "title": "How to Do the Right Thing",
    "pages": 201
  },
  {
    "title": "Consider the Lobster",
    "pages": 437
  }
]
With the list of books as our current output, each item of the array can be
mapped to a number using map(.pages). The filter .pages is applied to each
book and map returns an array of the results.
cat numbers.json | jq '.books | map(.pages)'
[
  1200,
  201,
  437
]
This is precisely the required input for add. The last step is thus to pipe
this output to it.
cat numbers.json | jq '.books | map(.pages) | add'
And we get the following output.
1838
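Two variations on this pipeline are worth a quick sketch. The first builds the array of page counts with an array constructor instead of map. The second guards against an empty list, since add returns null when its input array is empty. The data below reuses the page counts from the books above with the titles left out, and the fallback value of 0 is an illustrative choice rather than something the original pipeline needs.

```shell
# Page counts from the books above; titles omitted to keep this short.
books='{"books":[{"pages":1200},{"pages":201},{"pages":437}]}'

# Same sum, two spellings: map(f) versus an explicit array constructor.
with_map=$(echo "$books" | jq '.books | map(.pages) | add')
with_iter=$(echo "$books" | jq '[.books[].pages] | add')
echo "$with_map $with_iter"
# prints 1838 1838

# add on an empty array yields null; "// 0" supplies a fallback.
empty_sum=$(jq -n '[] | add')
safe_sum=$(jq -n '[] | add // 0')
echo "$empty_sum $safe_sum"
# prints null 0
```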
The next section expands on this construct.
Perform a Mathematical Operation On An Array of Values
A related use case is to perform a mathematical operation on each value in a list of values. To keep things brief while still conveying the general concept, let’s take the case of rounding each value in an array.
echo '[123.4, 124.5, 125.6]' | jq '[.[] | round]'
As you might expect, this rounds each value in the input array and produces an array of the rounded values.
[
  123,
  125,
  126
]
Here is the equivalent version using map(f).
echo '[123.4, 124.5, 125.6]' | jq 'map(round)'
This is perhaps more readable, especially when used in a map-reduce style of
operation. For example, to find the maximum value (that is, the reduce
operation), pipe the result to the max function.
echo '[123.4, 124.5, 125.6]' | jq 'map(round) | max'
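The same map-then-reduce shape works with other per-item and reducing functions. The sketch below, which assumes jq is on your PATH, runs the max pipeline above and then swaps in floor, ceil, and min, all of which are jq builtins. The input array is the one used in this section.

```shell
values='[123.4, 124.5, 125.6]'

# Round each value, then reduce to the largest and the smallest.
rounded_max=$(echo "$values" | jq 'map(round) | max')
rounded_min=$(echo "$values" | jq 'map(round) | min')
echo "$rounded_max $rounded_min"
# prints 126 123

# Other per-item operations slot into map the same way.
floored=$(echo "$values" | jq -c 'map(floor)')
ceiled=$(echo "$values" | jq -c 'map(ceil)')
echo "$floored $ceiled"
# prints [123,124,125] [124,125,126]
```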
Get All the Values of a Given Element
For the final example, let’s say we want to find all the values in the names
arrays of the addrs element of each thing object in the things array. The
input is stored in things-addrs.json.
{
  "things": [
    {
      "name": "thing-000",
      "addrs": [
        {
          "names": [ "addr-001", "addr-000" ]
        },
        {
          "names": [ "addr-001", "addr-002" ]
        }
      ]
    },
    {
      "name": "thing-001",
      "addrs": [
        {
          "names": [ "addr-002", "addr-003" ]
        },
        {
          "names": [ "addr-004", "addr-003" ]
        }
      ]
    },
    {
      "name": "thing-002",
      "addrs": [
        {
          "names": [ "addr-004", "addr-005" ]
        },
        {
          "names": [ "addr-004", "addr-006" ]
        },
        {
          "names": [ "addr-004", "addr-005" ]
        }
      ]
    }
  ]
}
Again, we call upon the map(f) function to map from an array of objects to an
array of values.
cat things-addrs.json | jq '.things | map(.addrs[].names)'
This does not produce a flat array of values, though. Instead we get a list of
lists, because each result of the filter (.addrs[].names) is itself an array.
[
  [
    "addr-001",
    "addr-000"
  ],
  [
    "addr-001",
    "addr-002"
  ],
  [
    "addr-002",
    "addr-003"
  ],
  [
    "addr-004",
    "addr-003"
  ],
  [
    "addr-004",
    "addr-005"
  ],
  [
    "addr-004",
    "addr-006"
  ],
  [
    "addr-004",
    "addr-005"
  ]
]
This is quite a common case, and the flatten function takes each item of each
nested array and produces a single array with all the values.
cat things-addrs.json | jq '.things | map(.addrs[].names) | flatten'
This has all the values contained in all the arrays, as follows.
[
  "addr-001",
  "addr-000",
  "addr-001",
  "addr-002",
  "addr-002",
  "addr-003",
  "addr-004",
  "addr-003",
  "addr-004",
  "addr-005",
  "addr-004",
  "addr-006",
  "addr-004",
  "addr-005"
]
We might, however, be interested only in unique entries. Piping this output to
unique removes duplicates and returns the remaining values in sorted order.
cat things-addrs.json | jq '.things | map(.addrs[].names) | flatten | unique'
[
  "addr-000",
  "addr-001",
  "addr-002",
  "addr-003",
  "addr-004",
  "addr-005",
  "addr-006"
]
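The flatten step can also be avoided by iterating into the names arrays directly and collecting the results with an array constructor. This is a sketch of that alternative, not a replacement for the approach above. The input is the content of things-addrs.json inlined into a shell variable so the snippet runs on its own.

```shell
# Same data as things-addrs.json, inlined in compact form.
things='{"things":[{"name":"thing-000","addrs":[{"names":["addr-001","addr-000"]},{"names":["addr-001","addr-002"]}]},{"name":"thing-001","addrs":[{"names":["addr-002","addr-003"]},{"names":["addr-004","addr-003"]}]},{"name":"thing-002","addrs":[{"names":["addr-004","addr-005"]},{"names":["addr-004","addr-006"]},{"names":["addr-004","addr-005"]}]}]}'

# Approach from this section: map, flatten, then unique.
via_flatten=$(echo "$things" | jq -c '.things | map(.addrs[].names) | flatten | unique')

# Alternative: the trailing [] iterates each names array, so the
# constructor collects plain strings and no flatten is needed.
via_iter=$(echo "$things" | jq -c '[.things[].addrs[].names[]] | unique')

echo "$via_iter"
# prints ["addr-000","addr-001","addr-002","addr-003","addr-004","addr-005","addr-006"]
[ "$via_flatten" = "$via_iter" ] && echo "both approaches agree"
```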
Summary
Perhaps you have inherited some code that used with_entries and did not
understand what it was doing. Or maybe you wrote that code, forgot what it
did, and need to explain it to a colleague. Hopefully this explanation assists
with these situations, and others as well. To close, here are a few key
takeaways.
- Pipe arrays into map(f) to output arrays with items resulting from f.
- Pass the filter f to map such that f produces the desired JSON type (string, object, array, etc).
- When starting with an object and needing to combine the name of each element with its value, first convert it to name-value pairs using to_entries, then use map(f) to perform further mapping.
- When you need to calculate a specific resulting value from values contained in an element of each object contained in an array, use map(f) to extract the values and pipe its output to your required reduce operation (for example, add or max).