Exploring JSON Parsing Techniques

Discover how to prompt an AI to deliver responses in structured JSON format, enabling precise extraction and handling of data. Learn the essentials of parsing JSON data using Python to streamline interactions with AI-generated responses.

Key Insights

Understand the fundamentals of JSON (JavaScript Object Notation), a lightweight, human-readable data interchange format used extensively for parsing structured data, including nested objects, arrays, booleans, numbers, and null values.
Utilize Python's built-in JSON library to effectively parse JSON strings into accessible dictionaries, eliminating the need for additional installations and simplifying data extraction processes.
Enhance AI interactions by explicitly instructing ChatGPT to format responses as JSON, thereby facilitating more precise data handling, such as differentiating between detailed "long answer" and concise "short answer" content.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

All right, lab 2, prompting the AI to respond in JSON format. In this lab, you will prompt the AI to answer in JSON format, which you will then handle accordingly by extracting the properties from the JSON. So from server 02 py, do a save as and call the new file server 02 lab py.

And then we need to install JSON, which is a module which allows us to work with JSON. JSON is a specially structured, JSON example, a special structure. JavaScript object notation is a lightweight data interchange format, easy for humans to read and write, easy for machines to parse and generate.

Parse means taking the JSON, which is a string, and busting it up into little keys. Constituent parts. So here is a simple example of JSON.

It's a dictionary key value pair with the key in double quotes and the value if it's a string in quotes, else not in quotes. Here you go. That person is set equal to a dictionary, but that's not JSON, because those aren't double quotes.

JSON is double quotes. Wikipedia, give me an example of some JSON. There you go.

Python for Data Science Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes, 1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

Here's some JSON. Here is one JSON. This is one dictionary.

Now, JSON is transmitted across the internet as a string, which you then have to so-called parse, which means break it back into its constituent data points. So first name John, last name Smith. We have a Boolean.

We have the values can be any data type. We have age as a number. We have address, the value of which is itself a dictionary.

And phone number is an array or a list of multiple phone numbers. Children is a list of multiple kids. Spouse, null.

There is no spouse, or you could make a Boolean out of that if you like. False. Married, false.

So that piece of data. So what we can do is ask ChatGPT to give us our data in this format. If we wanted more fine control over the answer, say, of that question about Grand Slam, we could say, hey, give me a long answer and a short answer.

Just give me two different answers and provide the answer in JSON. So what you would get back is you would have a key that said long answer, and it would be blah, blah, blah. And then short answer, which would be a shorter blah, blah, blah.

And that would be coming back at you as this dictionary of two key value pairs, which we would then parse or break into its individual data points, which you could then handle. I'll put them individually. Pick one that you wanted to publish to the web.

That's what we're going to do. We're going to, so this is a lab. I mean, it's kind of challenging.

Just as well be a lesson. So you can, but I want to give everyone a chance every time to follow along, follow instructions, and see what you can do. It's kind of like following a cookbook, right? See if you can do it.

A little bit challenging, but see if you can make this thing. But if you just give it a try. And again, if you tackle this, when you see the solution implemented, I think it'll click better.

So pause the recording, try the lab, and then we'll execute the solution on the other side. OK, we're back with the solution to lab 02, prompting the AI to respond in JSON format as opposed to just text. And then we will extract the properties from the JSON.

So from server 02, we're going to do a save as and call the new file server 02 lab. File, save as. Server 02 lab.

Server 02 lab, right? All this other stuff is the same for now. In the terminal, we're going to install JSON, which will require us to quit any server that is running. And there isn't one running.

Great. We're going to say pip install JSON. Cannot find, ooh, ow.

All right, what's going on? Oh, you know what? Installing JSON might have a different syntax. Yeah, I think it does. It's kind of weird.

It's like, let's try this. Nope. It won't work.

Pip install, Python install JSON. I think I might have a note on this somewhere.

Let's see. Hmm. OK.

Tell you what. Let's ask the chatterbox, right? We're trying to talk to you, and we're trying to get JSON. Oh, help.

Pip install JSON isn't working on Mac. Is there some alternative syntax? Oh, part of the standard library. You don't need to install it.

OK, great. You just import it. Fantastic.

All right, we don't, sorry, don't install. Do not install. Take that out.

Yay. Much better. It's already in there.

OK, cool. We're just going to import it. Open AI.

Make sure we're in the right place. Import JSON. Fantastic.

Modify your prompt. That is your chat question. By requesting that the AI provide the answer in JSON, modify the content prompt to request the AI's answer.

OK, so I'm just going to copy paste. I'm going to type it. You should type it.

Let's type this stuff. Please provide your response in JSON format as a long answer and a short answer with the answers, the values of the long answer and short answer keys, or words that affect. Please provide your response in two parts, long answer and short answer in JSON.

You know this a little bit better. Wording, we'll just use this. We're telling it the names of the keys.

But that's not all. But you have to do a little bit more than that. You can't just say that.

You have to also kind of formally specify that. We're going to add a response format parameter to the request, the value of which is an object specifying JSON. We can't just say it in the prompt.

We have to also specify it as this new property. Response format equals dictionary type value JSON object. You can throw that in anywhere.

But usually, you'd put it before the messages. We'll say response format equals dictionary type JSON format. Yep, JSON object, rather, excuse me.

JSON object, comma. Now, instead of returning the response directly, because we're not trying to just dump raw data out here, we're going to save it to a variable. So the response object, which we send, let's call this the AI answer.

Because by the time we extract the text, it's not the whole response at that point. The response is all this. And then we're unpacking it just to get the AI answer here.

So all right, we already have a variable response. It's this entire object. And we'll just update that.

And then the AI's answer is JSON. It's a string. It's a JSON string.

But we have to parse it. Let's call it the, yeah, it's the, so AI answer is actually JSON as a string that we then pass to the JSON.loads method, which parses or unpacks it into a usable object or dictionary. So that's the JSON string.

And that's the JSON, that's the dictionary. In fact, this would be more accurate to say that it's JSON that we are then parsing into a JSON dictionary. That's your AI JSON answer.

That's the entire response. We're drilling in to get the answer, which is this JSON answer, which we're then going to pass to this JSON.loads method, which takes the stringy JSON and breaks it into a usable dictionary answer. So it'd be kind of like, go back to that JSON example for a moment here.

So JSON's a string. In fact, I'll tell you what we'll do here. I want to show you the fact that it's a string.

If we print this, the AI JSON data type should be string. We should expect to see that as a string, because that's what JSON is. We're going to parse the JSON string.

And then if you print the data type of that, you're going to get a dictionary. Should do anyway. So here we are.

We're just typing in the book. We need the actual. OK.

Parse, unpack, and parse the JSON returned by the AI. And by JSON, we mean it's going to give us something that looks like this. Long answer, grand slam, blah, blah, blah.

And short answer is going to be just blah, say. Shorter, like that. That's what we're expecting back.

We're not returning that. We want to save that. We're going to call it AI JSON.

And then we're going to say AI dictionary. After we unpack the JSON, after we parse the JSON, it's now no longer a string. OK, like when it comes back from the AI, it doesn't have to be from an AI API.

Just any API delivers JSON across a network. Any object like this across a network has to be stringified or serialized. So it's got wrapper quotes around it that basically have to be stripped off to gain access to the individual data points.

So that is why we use JSON.loads. And you pass in your stringified JSON, and it unpacks it as a dictionary. And we'd like to print this along the way in the terminal just to see if it's working as expected. Fix your quotes when you copy paste.

That's a string. And that should be a dictionary. So pause the recording.

Do all this stuff. So like when I'm really teaching, this is 20 minutes or so for students just to type this. And I'm going way slower, right? Because otherwise people get confused.

So you're going to get a response back from the AI API, right? ChatGPT knows to respond as JSON. It's told here in the response format, and it's also told in the prompt. So it's going to give you back this kind of data, a dictionary with a long answer and a short answer.

Key. Each of which has a value. The grand slam long answer, longer string, obviously.

What we do is take that special string and parse it, basically strip the wrapper quotes off of it so that it's an actual dictionary. To do that, you pass it to JSON.loads. And along the way, we're printing the data types as we go so that we can see. In fact, we'd love to see the dictionary and the string as we go, right? Just see.

Well, it's pretty long. Maybe we won't. We don't need to.

We don't need to see it. We'll see that in the browser. That's fine.

We just want to get the data type. If we see the data type, it works. We got the data.

OK, so now that we have the dictionary, what do we want to do with it? We want to JSON.loads your aijson. And then we're going to take your dictionary, and we're going to say we want the short answer. And we're going to return that.

We're not rendering template. There's no HTML page. We're just returning to the browser the value of the short answer key, right? So JSON is a dictionary, a stringified dictionary, all right, like that.

It's also called serialized. Serialized dictionary, stringified. Parse the JSON string as aijson, OK, the aijson.

OK, JSON. Parse unpack the aijson with JSON.loads, saving the result as aidictionary. There we go.

Aijson is the stringy thingy like that. The result is this, except with no quotes around it so that you have access to the actual data. The data type changes from string to dictionary.

We're going to return the short answer, meaning publish the short answer to the browser directly. All right, let's go. I think that's enough.

That should be sufficient. I mean, if we get it to work, we can come back here and output the long answer as well or instead. We're going to quit the currently running server and start the lab server.

This would have been a lot to do as a lab, obviously, you know. So I mean, it's basically a lesson, calling it a lab, but just to encourage you to try to read through steps a little bit. Here it's actually telling you exactly what to type.

I mean, a lab can involve you just simply typing what you're instructed to type. You know, that's fine as well. Just something to get you to stretch your wings a little bit.

Not that this isn't already doing that in the regular lessons. Okay, we're going to quit the currently running server. If it exists, it's not running.

And we're going to say server02lab.py. Okay, so it's running. Idea being, if we go to the home page and refresh, it's going to give us, it's going to do the response and dump out the short answer, which should be significantly shorter than the current answer, which is the long answer. So here we go.

Okay, let's take the URL. I guess I closed. So yeah, it's spinning its wheels, sending the request.

See the little spinner? Type error, okay. Error, function chat did not return a valid response. Oh, did I never return anything? That's possible.

Oh yeah, never returned anything. Okay. Return AI dictionary short answer.

Output the AI's short answer to the browser. All right, let's try that again. It's great to see these error messages though.

Function did not return a valid response, right? The function must return a value. And I had all that stuff, but never returned anything. In other words, never did anything with it in the final analysis.

Run it again, send the response again. Boom. All right, nice and short, right? We don't need, doesn't give any examples of baseball or tennis or anything.

Great, now we're going to say long answer. Long answer ought to have the, yeah, there it is. It's not as long as the original answer, but it's nice and tight.

In baseball, a batter hits a home run with the base loaded. Okay, there you go. And it mentions all four of the events in tennis.

That's perfectly fine and enough. Now, pretty print. Look at that.

There's a raw JSON or the raw dictionary. And let's look at, is it giving us the data types? Yes, check it out. AI JSON data type is string.

AI dictionary data type is dictionary. So it is printing that useful information. And that concludes the lab.

Key Insights

Brian McClain

How to Learn Python