Discover how string methods and immutability work in Python, and learn efficient techniques for manipulating text. Explore practical examples, including formatting strings and constructing URLs from news headlines.
Key Insights
- Understand that strings are immutable in Python; methods like .lower(), .upper(), and .replace() produce a modified copy rather than altering the original string directly. To retain changes, you must explicitly assign the result back to the original or a new variable.
- Learn the differences between mutable and immutable types by comparing lists and strings: list methods such as .append() or .reverse() directly modify the original list without needing reassignment, unlike string operations.
- Gain proficiency in constructing readable and concise strings through f-string formatting, which eliminates the need for numerous plus signs and string conversions, particularly in complex concatenation scenarios or when combining variables and literal text.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
String methods. Strings are immutable, meaning they cannot be changed. So what that means is, if you have fruit = "apple" and you change fruit to "banana", you're not really changing the letters, right? That's impossible.
You're just telling the variable to go point at some other value. When you call a method on a string, because strings are immutable and cannot be changed, it's always producing an answer. It's basically taking your instructions.
The string method is serving as an instruction to make something, not to change something. So it makes you a new thing. Now, if you'd like the new thing to kind of pass for the original, you just keep the same variable name.
So the new thing under the variable name of the old thing takes on these made-to-order specifications. In other words, if you want to lowercase a string, you could call .lower() on it and it'll work, but it won't save the result unless you set the variable equal to it. We're going to look at how that works.
So notice it's always set equal to itself. Well, you can set it equal to something else, like string_lowercase, but if you'd like to keep the one variable name going and you want to modify it with a string method, you run the command but set it equal to itself so that the returned copy goes under the same name.
It looks like you changed the original, but you actually just made a copy by the same name. So .lower() lowercases, .upper() uppercases, .capitalize() capitalizes the first letter of the string, .title() capitalizes every word in the string, and .replace() works on a find-and-replace concept.
You feed in two arguments: what you're finding and what you're replacing it with. .split() splits a string into a list of words. To get a list of characters instead, just pass the string to the list() function.
You can also feed .split() a delimiter. So let's say you split on a dash: that splits a string into a list of pieces at the dashes. So let's say you had a file name like, well, we'll get into it.
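Here is a small sketch of the three splitting moves just described. The sample strings (the sentence, the word, and the dashed file name) are made up for illustration and are not from the lesson files.

```python
# Hypothetical sample values, chosen only to illustrate the calls.
sentence = "the quick brown fox"
file_name = "2024-sales-report"

# With no argument, split() breaks the string on whitespace into words.
words = sentence.split()
print(words)    # ['the', 'quick', 'brown', 'fox']

# list() turns a string into a list of its individual characters.
letters = list("fox")
print(letters)  # ['f', 'o', 'x']

# With a delimiter, split() breaks the string wherever the delimiter appears.
parts = file_name.split("-")
print(parts)    # ['2024', 'sales', 'report']
```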
We have examples. Okay, let's take this for a spin. One of the big concepts we want to make sure we understand as we do this is that you're always returning something, because a string is immutable.
You can't directly change the string the way you can directly sort a list. Lists are mutable. So you have pet, in mixed case, and if you print pet, there it is.
Now you could print pet.lower() and it's lowercase, and you'd think it changed pet, but strings are immutable. They can't be changed. If you print pet again, it's back how it was.
So all you printed was what you would get if you did save the change, if you did make the copy. So what we're going to do then is run that again. We'll say pet = pet.lower().
We're going to save the change. There. Now it really changed it.
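The sequence just described might look like this. The mixed-case value "FlUfFy" is an assumed stand-in, since the exact string used in the lesson isn't shown here.

```python
pet = "FlUfFy"      # assumed mixed-case value for illustration
original = pet      # keep a second name on the original string for comparison

print(pet.lower())  # fluffy  (a lowercase copy is printed, then discarded)
print(pet)          # FlUfFy  (the original is untouched)

pet = pet.lower()   # catch the return value under the same name
print(pet)          # fluffy  (pet now points at the new lowercase string)
print(original)     # FlUfFy  (the old string itself never changed)
```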
If you just run a string method, it'll work transiently, but it won't stamp the change onto the original. It'll return a copy, which in the case of printing, you're just not saving. So you have to set it equal to itself to save the return value.
It's returning something, giving you an answer. You have to catch it. So by contrast, lists are mutable.
You can do stuff to lists and it sticks. We could say fruits.reverse() and then fruits.append(). And it just doesn't matter that you didn't save it. You'll notice these moves are not being saved to anything, and it completely works.
You've appended kumquat and reversed all the fruits without saving anything to anything. So you can do stuff to lists and it sticks. You can call methods on lists and the changes stick, persist, without having to set the method call equal to the list or some new var like you have to do with string methods.
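For contrast with strings, here is a sketch of the list version. The starting fruits are assumed; only "kumquat" comes from the lesson. It also shows a related detail: in-place list methods return None, which is why you don't assign their result to anything.

```python
fruits = ["apple", "banana", "cherry"]  # assumed starting list

fruits.reverse()                        # reverses the list in place
returned = fruits.append("kumquat")     # appends in place

print(fruits)    # ['cherry', 'banana', 'apple', 'kumquat']
print(returned)  # None  (in-place list methods hand nothing back)
```

So writing fruits = fruits.reverse() would actually lose the list, because the name would end up pointing at None.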
I don't know if that makes a lot of sense, but we've been working a lot with lists where we just run these methods without setting anything equal to anything. But if you're running string methods, you have to set the method call equal to itself typically, or a new variable anyway, something to capture the return value. All right, let's try strings uppercase.
We'll change pet to "dog", and we'll print pet.upper() and then we'll print pet again. So you print pet.upper() and it totally looks like pet got uppercased, but it only got uppercased here in the print statement, and you didn't catch the return value. You didn't set it equal to itself, so the change doesn't stick.
So let's say pet = pet.upper(), catch the return value, and now pet is truly uppercase for real. The first time you print pet after doing the upper, it's back to how it was, because we only printed the change; we didn't catch the return value, we didn't set the variable equal to the operation. In the second case we did, hence when we print pet it is uppercase for good.
Let's capitalize. We have way out of money. I'm going to call it question, and we're going to capitalize it: we'll print question.capitalize().
It looks like it worked, but then if you print question again you see that it didn't stick. So again, what you do is run the move while setting it equal to itself to stamp the change, catch the return value, and there we go. Title case, same move: .title() capitalizes every word.
We'll print question.title(), and okay, it looks like it worked, it capitalizes every word, but if you print question again it's back to how it was. So again the go-to move is you run the operation, but you set it equal to itself to catch the return value, which is the answer or the result, and then it sticks. Because strings are immutable you can't change them directly; you can only make a copy to spec and then catch that return value under the same name, if you like, to make it look like you changed it.
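A compact sketch of the three case methods from this stretch. The question text is an assumed example; the lesson's actual string isn't recoverable here.

```python
pet = "dog"
question = "what is the meaning of life?"  # assumed example string

# Printing alone shows the result but saves nothing.
print(pet.upper())            # DOG
print(pet)                    # dog

# Catch the return value to make the change stick.
pet = pet.upper()
print(pet)                    # DOG

# capitalize() uppercases only the first character of the string.
print(question.capitalize())  # What is the meaning of life?

# title() uppercases the first letter of every word.
question = question.title()
print(question)               # What Is The Meaning Of Life?
```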
All right, str.replace(): let's find and replace characters. Let's start with just a couple of letters. We'll say british_english equals "my favourite colour is", I don't know, whatever, "peach". In American English we wouldn't say "our", right? So american_english = british_english.replace(), and what are we going to replace? We're going to replace "our" with "or", and we'll print them both. So it's the same principle, we have to catch the return value, but a little bit different from what we did before, in that we're preserving the original under its own name, and british_english is still going to be British English.
Yep, in american_english, "our" becomes "or". All right, I'll make a challenge out of this: replace cats with dogs. Pause and see if you can replace cats with dogs. All right, we're going to do this little mini challenge. We'll say a claim or boast, "cats are the best", boom, and we're going to change the boast. Rather than keep the same variable name, we can say cat_boast and dog_boast. Let's make them different, and we'll keep the cat thing.
dog_boast is going to be equal to cat_boast.replace(), and we're going to replace "cats" with "dogs", and we'll just print them both. There, that works.
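Both replace examples, sketched out. The exact sentence wording is assumed; the variable names follow the ones spoken in the lesson. Note that replace() is case-sensitive and swaps every occurrence it finds.

```python
british_english = "my favourite colour is peach"  # assumed wording
american_english = british_english.replace("our", "or")

print(british_english)   # my favourite colour is peach  (original preserved)
print(american_english)  # my favorite color is peach    (both "our"s replaced)

cat_boast = "cats are the best"
dog_boast = cat_boast.replace("cats", "dogs")

print(cat_boast)         # cats are the best
print(dog_boast)         # dogs are the best
```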
All right, now we've done some concatenation with plus signs and sometimes it's gotten a little long. Remember we were doing the SAT score: we had to say "your SAT score report," "your math score was" plus math score, plus "your verbal score was" plus verbal score, and then it didn't work unless we put str() around the variables, because you can't concatenate numbers onto strings.
Kind of a pain. A lot of plus signs, a lot of having to make sure you're not conflicting in the concatenation by doing numbers—you know, turn the numbers into strings. It's pretty tedious. You get the plus signs wrong or forget a plus sign.
So there's actually a better way. It's called… but I didn't want to give it to you too early. Let you really practice typing carefully—the closing quotes and plus signs.
So rather than have multiple sets of quotes, you just have one set of quotes. You have no plus signs at all to connect the substrings to the variables or to go back and forth. You put an f in front of the one pair of quotes.
So put an f in front of everything, delete all the plus signs, delete all but the outer wrapper quotes. No need to convert numbers to strings. You wrap the variables in curly braces.
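As a quick before-and-after, here is the SAT report written both ways. The scores and the exact wording of the report are assumptions for illustration.

```python
math_score = 700    # assumed values
verbal_score = 680

# Plus-sign concatenation: every number has to go through str() first.
report_plus = (
    "Your SAT score report: your math score was " + str(math_score)
    + ", your verbal score was " + str(verbal_score)
)

# f-string: one pair of quotes, no plus signs, variables in curly braces.
report_f = f"Your SAT score report: your math score was {math_score}, your verbal score was {verbal_score}"

print(report_plus)
print(report_f)
print(report_plus == report_f)  # True
```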
So let's try this. We're going to say, okay, we'll say pet. We're going to deliberately make something that's really difficult to concatenate.
So you have all these various cat variables here, right? Now we're going to use regular string concatenation and make this sentence.
We have random—what's that? Did we not grab random? Yeah, we have random, we haven't used it yet. Let's use random here.
So what are we going to do? Okay, so what we're going to do is use regular string concatenation, you know, with the plus sign. That would be the plus sign and the variables.
Make this sentence: "Meow, my name is Fluffy. I am a three-year-old cat. My favorite toy is my floppy fish."
Use a random toy each time. Choose a random toy each time. We'll see if we remember our random methods for that.
All right, so regular concatenation: we could say greeting = "meow".
Now we don't want to write "meow"; wherever there's a variable, we want to put the variable. We're going to put sound. That's your meow.
So take out "meow". Well, let's take the exclamation out of the sound.
Let's even take out the capitalization, because we don't know what sound we'll get. We'll say sound.upper(), right? We're going to capitalize the sound because it's starting the sentence.
Right, we'd want the flexibility. We'll have sound = "meow". So sound.upper() capitalizes "meow".
That's what we're going for here, right? We'll say desired_output.
We capitalized "meow" and added the exclamation: "My name is…" Now we have to close our string and say name, right? Name is Fluffy.
Then open up another string: "I am a blank year old."
Okay, so we have to close and now we have to add, what's that? Age. age = 3 or 4. "I am a" plus age.
Hmm, it doesn't like that. Why? Oh, you can't concatenate an int onto a string, right?
We'd have to stringify the age with str(): "I am a five-year-old."
Now we have to close that and put pet, right? You're a cat.
And we have to open up another substring: "My favorite toy is my…" We gotta close that.
And we could say random.choice(toys), and that will give a random toy.
And we gotta finish the job with an exclamation.
So notice we have one, two, three, four, five, six, seven, eight, nine plus signs—and tons of inner quotes.
And it's extremely easy to mess up. It's just that I have a lot of practice doing this, otherwise I'd have even way more errors.
There we go: "Meow…" Oh, that's upper(); we want capitalize(), which uppercases only the first letter.
Okay: "I'm a four-year-old cat. My favorite toy is my bell." And if you run it every time: "my slipper," "my floppy fish". It picks a random item every time.
Remember, random.choice() goes into a list and grabs an item at random.
So there's your desired output—we got it—but it's very difficult to do with the concatenation.
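Put together, the plus-sign version might look like the sketch below. The variable values are assumed from what's spoken in the lesson (Fluffy, a four-year-old cat, and the three toys mentioned), and the exact punctuation of the sentence is a guess.

```python
import random

# Assumed variable values, based on what the lesson says aloud.
sound = "meow"
name = "Fluffy"
age = 4
pet = "cat"
toys = ["bell", "slipper", "floppy fish"]

# Nine plus signs, lots of inner quotes, and str() around the number.
greeting = (
    sound.capitalize() + "! My name is " + name
    + ". I am a " + str(age) + "-year-old " + pet
    + ". My favorite toy is my " + random.choice(toys) + "!"
)

print(greeting)  # e.g. Meow! My name is Fluffy. I am a 4-year-old cat. My favorite toy is my bell!
```

Leaving out str() around age raises a TypeError, since Python won't join an int to a string with a plus sign.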
Get it? CONCAT-a-nation? Ohhh.
All right, we're going to refactor the cat intro using this f-formatting here.
Let's try a simpler version of f-formatting though first. This one's a lot.
Okay, we're going to say f-format instead of plus concatenation.
We'll just say, you know, first_name = …
Let's just do, like, go back to the basics.
This was the very first variable you ever declared. Let's just do this.
I'm going to make the name tag. Okay, we'll say name_tag = "Hello…"
We want to warm up with this. We don't want to do the cat thing immediately with this new method.
"Hello, my name is " + first_name + " " + last_name
I think we don't need punctuation at the end of that.
There you go, name_tag.
If you want it to be like a real name tag, you could make it multiple lines.
Regardless, the backslash-n gives you a line break.
Okay, let's look at the f-formatting alternative to that.
You've got your variables.
Let's just leave these alone. You'll say—I mean, we already have them, but putting them here is easier to read.
We'll put an f in front of everything.
Look at the rules: put f in front of everything, delete all the plus signs, delete all the…
So no plus signs, no quotes except the outer wrapper quotes.
Now, if you run that as is, it's going to literally output the variable names.
So what we want to do is put the variables in curly braces.
There. So there it is refactored.
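The name tag warm-up, sketched both ways. The first and last names are placeholder values, not the ones used in class.

```python
first_name = "Ada"       # placeholder values for illustration
last_name = "Lovelace"

# Regular concatenation: three plus signs and a separate " " for the space.
name_tag_plus = "Hello, my name is " + first_name + " " + last_name

# f-string: one pair of quotes, variables in curly braces.
name_tag = f"Hello, my name is {first_name} {last_name}"

# Forget the curly braces and the variable names print literally.
name_tag_wrong = f"Hello, my name is first_name last_name"

# A \n inside the string gives a line break, like a real name tag.
name_tag_lines = f"Hello, my name is\n{first_name} {last_name}"

print(name_tag)        # Hello, my name is Ada Lovelace
print(name_tag_wrong)  # Hello, my name is first_name last_name
print(name_tag_lines)
```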
Actually, put it here… oh no, no, no, sorry.
Now do the cat one, now that we know a little bit about this f-formatting alternative to string concatenation.
Let's revisit the cat thing, which we can move down now.
All right, so we start with the regular concatenation on kind of a simple example, then we ramp it up to this cat example that's very complicated.
It's got this number that we had to stringify and all this other stuff. It's random.
Now, we're going to refactor that.
What we can do is just copy all this, and now we put in f with quotes, and then every variable goes in curlies.
And the variable with the method call, that's all part of one curly.
You take out all plus signs and interior quotes, and you wrap every single variable in curlies.
You do not need to stringify, as noted above as well.
This really cleans it up, right? The curlies are there to identify—since you don't have quotes and plus signs separating the variables from the substrings—you need a way to say, hey, that's a variable.
That's what the curly braces are for.
All that should work, and it does.
So it's still pretty long, there's a lot going on, but it's way—look at that—it's way better than the regular concatenation, in my opinion, and pretty much every developer's opinion, I would think, who does this stuff.
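Here is roughly what the refactored cat intro could look like. As before, the variable values and sentence punctuation are assumed from what's said in the lesson.

```python
import random

# Assumed variable values, based on what the lesson says aloud.
sound = "meow"
name = "Fluffy"
age = 4
pet = "cat"
toys = ["bell", "slipper", "floppy fish"]

# One f, one pair of quotes, no plus signs, no str().
# A variable plus its method call goes inside a single pair of curlies.
greeting = f"{sound.capitalize()}! My name is {name}. I am a {age}-year-old {pet}. My favorite toy is my {random.choice(toys)}!"

print(greeting)  # e.g. Meow! My name is Fluffy. I am a 4-year-old cat. My favorite toy is my slipper!
```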
All right, okay, continuing.
We're going to make a URL, a file path from news headlines—from a news headline.
So let's have a news headline: "Mystery Drone Spotted Over New Jersey."
And we're going to—this is what we want—we're going to start with a headline.
You move the expected result up, so you have an idea what the heck are we doing here.
We'll start with headline and base URL of news website.
We have this base URL—you could say, just a string to start the website path.
And then we have this news headline. Just call it headline.
And what we want is that. We want to take this base URL and tack onto it the headline, except everything in lowercase, with dashes in between the words. Then we want to put on a random six-digit number, and then we want to put .html on it.
We're going to lowercase, change the file name from a headline.
We need to operate on the headline—we're not going to make a challenge out of this—we're just going to do this together right now.
Okay, so step one: we're going to lowercase headline.
We'll say headline = headline.lower().
And if you print headline, you've got your headline in lowercase.
Step two: replace all spaces in headline with dashes.
We're going to say headline = headline.replace(" ", "-").
We're taking our string methods for a spin here. This is what we're doing. We're challenging ourselves to apply the string methods in a use case context.
We're processing the string, right? Because we're trying to get this.
Now that we have that, we're done with the processing of the headline.
Now we need this random six-digit number.
We'll say r = random.randint(100000, 999999), so r is always a six-digit number.
Then we would say headline—here, let's do this—let's add it on there.
We'll say headline = f"{headline}-{r}.html"
Headline with number and file extension.
There it is—we added that already—do that in one move.
Concatenate the file name onto the URL.
Now it's headline = url + headline.
We could just—now, when it's really simple concatenation, with just one plus sign…
URL from headline.
And this is kind of real—like, if you go look at news sites, you'll see that the paths on the web for articles quite often have a file name that is very close, if not exactly matching the news article’s headline.
So the URL from the headline is there—it’s actually a link to—you can click on it—it doesn’t exist, I mean, it takes you nowhere—but there it is.
So notice in this case, since it's just the one, it's just the two items being concatenated, the regular plus sign is actually easier than doing the f with the curly braces around the variable.
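The whole single-headline procedure, as a sketch. The base URL is a made-up placeholder (the lesson's actual URL isn't shown); the headline and the steps follow the lesson.

```python
import random

url = "https://www.example.com/news/"   # placeholder base URL, not the lesson's
headline = "Mystery Drone Spotted Over New Jersey"

# 1. Lowercase the headline and catch the return value.
headline = headline.lower()

# 2. Replace all spaces with dashes.
headline = headline.replace(" ", "-")

# 3. Random six-digit number: randint includes both endpoints.
r = random.randint(100000, 999999)

# 4. Add the number and file extension in one move with an f-string.
headline = f"{headline}-{r}.html"

# 5. Just two items to join, so a single plus sign is simplest.
headline = url + headline

print(headline)  # e.g. https://www.example.com/news/mystery-drone-spotted-over-new-jersey-482913.html
```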
So here are the rules—I put them up here—there's a little hint.
You should probably put the hint when you do the integer.
Now—bonus challenge!
Now the challenge part: can you do all this in a loop?
In other words, can you run this process on multiple headlines?
If you're trying to do something kind of complicated on a loop, the best way to do it, to start, is to not try to do the loop at all.
You just pretend there is no loop, and you just pick one item that you would be doing it on, and see if you can get it to work.
If we wanted to do this procedure to process multiple headlines all into file names, right?
Then you would want to just pick one of the many headlines, and then make it work.
And then once you get it to work, which we've done so far, then you would go, okay, let me now loop all the headlines, and run all this code as the contents of the loop.
So what we've got here is many headlines.
We'll say headlines = […]
Now it's plural. And we'd like to process them into file names, or URLs, except do it in a loop and save all the URLs.
We're going to make file names from headlines.
We're going to make a new list called web_urls.
Here's the expected result, right? Something like that.
So what we do is we just take all this code and we run it on a loop.
We're going to say for headline in headlines:
Just paste everything. So headline being the current headline—we just operate.
We don't even need to print all this. Let's take all this stuff out and clean it up.
Right—we don't need the notes twice—the explanations.
There.
And then what do you do with the headline when you're done?
Actually, at this point, we want to take web_urls and just append url + headline.
We don't even want to call it the headline at that point.
There it is.
And then you print, or pprint, the web URLs.
Boom, it works.
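One way the looped version could come together. The base URL and the second and third headlines are invented for illustration; the steps inside the loop are the same single-headline code.

```python
import random
from pprint import pprint

url = "https://www.example.com/news/"   # placeholder base URL
headlines = [
    "Mystery Drone Spotted Over New Jersey",
    "Local Bakery Wins National Award",      # invented headline
    "City Council Approves New Bike Lanes",  # invented headline
]

web_urls = []

for headline in headlines:
    # headline is the current headline; the body is the one-item code.
    headline = headline.lower()
    headline = headline.replace(" ", "-")
    r = random.randint(100000, 999999)
    headline = f"{headline}-{r}.html"
    web_urls.append(url + headline)

pprint(web_urls)
```

Reassigning headline inside the loop doesn't touch the headlines list, because each string method hands back a new string.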
All right, that's stuff to study. It's a lot going on.
We've got a loop, we've got two string methods, we've got random, revisiting lesson three with randint.
We also have random.choice for the cat toy.
We've done plenty of review of methods, list methods, and slicing.
And we learned a brand new way to concatenate without all the pesky plus signs.
That's a real bonus challenge. I'd pause and sit here and type it all out and make sure I understand it, and just try my best.
Okay, and again, just to recap, notice to tackle something this complex, and with this many steps—forget the loop.
Just pick one item in the loop to operate on, get it to work on one item in the loop, and then just loop that move—that sequence, right?
Then do it in a loop to every item, with headline representing the current headline from the list of headlines.