Discover how to protect your confidential data when using large language models like ChatGPT. Learn the quick steps to secure your privacy and prevent accidental information leaks.
Key Insights
- Recognize that large language models, when trained on proprietary or confidential business information, risk unintentionally revealing sensitive data to others.
- Disable data-sharing in ChatGPT by accessing your profile, navigating to settings, selecting "Data Controls," and turning off "Improve the model for everyone."
- Understand that free and individual paid users have data-sharing enabled by default, whereas business or enterprise accounts have data-sharing turned off by default.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
So, let’s talk about the data privacy issue. Large language models ingest vast amounts of knowledge and learn from that data; the more they can train on, the better they can be. But if they train on proprietary or confidential business information, they could learn something that they might then reveal to someone else. You want to be very careful to avoid leaking private information. The business plans, ChatGPT Team and Enterprise, exclude your data from training by default, so users on those plans don’t have to do anything to opt out. But free users do.
Even as a paying Plus user, I have to turn off data training manually. So, why don’t we all turn it off right now? My thought is that whether I’m on a free plan or a paid one, I don’t want them to train on anything I say, in case I share something private.
So, when you’re logged into your ChatGPT account, click the profile icon in the top right corner and go to Settings. Under the “Data Controls” section, you’ll see an option that says “Improve the model for everyone.”
Mine is turned off now, but yours is probably on by default. You don’t have to disable it; if you want to help improve the model for everyone, you can leave it on. But mine was on, and I turned it off.
That way, if you share anything, the model will not learn from your data or repeat your information to someone else. So, I would recommend disabling model improvement.
If you turn that off, your chats from now on will not be used for training. If you’ve already had chats and shared things, those could have been incorporated into the model, but going forward it should not learn from anything you share.
Now, granted, it is a matter of trusting the company, but from everything I know and have heard, there’s no reason to distrust them.