alexa
Facebook
Twitter
LinkedIn
Instagram
Whatsapp
Call Now
Quick Inquiry

Nested JSON to Multiple Dataframe in Pandas in python

Nested JSON to Multiple Dataframe in Pandas in python

The structure you are describing - a JSON of an indefinitely defined number of nested JSONs - fits exactly with a tree data structure. Since we are looking to store the ID of the parent JSON in each dataframe, we will approach this with BFS (breadth first search) a.k.a. level order traversal. This is a common graph traversal algorithm well suited to this kind of problem.

If a element has an id of None, it indicates it is the root or top level element.

 import json
import pandas as pd
import uuid


def nested_json_list_df(file : str):
    dict_list = []
    def bfs(json_dict : dict, node_name : str, parent_uuid : uuid.UUID):
        """Breadth First Search a.k.a. Level order Traversal
        """
        # Create parent node
        uuid_val = uuid.uuid1()
        out_dict = {'id': node_name, 'uuid': uuid_val, 'parent id': parent_uuid}

        # Search child nodes
        for key, val in json_dict.items():
            # If a child node is a dict itself, it is a sub-nested JSON
            if isinstance(val, dict):
                bfs(val, key, uuid_val)
            # A list of single-nested dicts is simply a new entry
            # A list containing dicts within dicts is interpreted
            # as another nested JSON
            elif isinstance(val, list):
                new_val = []
                for v in val:
                    if isinstance(v, dict):
                        new_dict = dict()
                        for key2, val2 in v.items():
                            # Indicates nested JSONs
                            if isinstance(val2, dict):
                                bfs(val2, key, uuid_val)
                            else:
                                new_dict[key2] = val2
                        new_val.append(new_dict)
                    else:
                        new_val.append(v)
                uuid2 = uuid.uuid1()
                out_dict2 = {'id': key, 'uuid': uuid2, 'parent id': parent_uuid, key : new_val}
                dict_list.append({key : out_dict2})
            else:
                out_dict[key] = val
        
        dict_list.append({node_name : out_dict})
        return dict_list
    
    ## Run BFS ##
    with open(file) as f:
        json_dict = json.load(f)
    df_list = []
    for d in bfs(json_dict, file, None):
        df_list.append(pd.DataFrame(d))
    return df_list


df_list = nested_json_list_df('temp.json')
for df in df_list:
     print(df) 

Output:

  Students
Students   [{'id': ',some id', 'value': 'some val'}, {'id...
id                                                  Students
parent id                                               None
uuid                    2d68cce3-c7f7-11ec-81a3-b0227ae68aa0
                                                       Error
Error      [{'id': ',some id', 'code': 'some code', 'emes...
id                                                     Error
parent id                                               None
uuid                    2d68cce4-c7f7-11ec-b0da-b0227ae68aa0
                                      temp.json
date                                 12:23:2022
id                                    temp.json
name                                  Some name
parent id                                  None
uuid       2d68cce2-c7f7-11ec-bcd8-b0227ae68aa0 

221 0
7

Write a Comments


* Be the first to Make Comment

GoodFirms Badge
GoodFirms Badge

Fix Your Meeting With Our SEO Consultants in India To Grow Your Business Online