Data exfiltration is a concern for most organizations. Protecting your data from prying eyes is hard enough, but keeping it on your network? Now that’s a challenge. With technology continuing to advance, we are forever moving to cloud this and cloud that. Because we can’t live in a world where we only use one company, our data is scattered around the web like nobody’s business.
For personal data, we understand it’s a losing battle. Companies say you can’t use their service unless you let them sell information about you, such as the fact that you like Marmite on crumpets. It’s like being back at school, where the person who brought the ball in went home because they didn’t get picked.
For enterprises and large businesses, this nightmare is tripled. A business needs its data to be accessible in order to be productive. It also needs to collaborate with other companies, which means sharing that data. With all this data moving left and right, how do you secure it?
Data protection tools such as Azure Information Protection (AIP), Varonis or Netwrix will help, but they take a lot of time and a lot of money. Unless you classify everything as confidential, you are still going to have exposed data. Solutions like these also depend on manual intervention from users, and as humans we make mistakes. That mistake could be a user not classifying the payroll spreadsheet, for instance, or an admin not configuring the platform correctly.
I think that to truly get to a good place, you need to implement multiple solutions tailored to how you want to secure the data, and where you can’t limit access, at least audit it. Target the most important data and attempt to secure the rest. A lot of it is also about accepting the risk. Most controls target one or two threats, but with exfiltration there are multiple ways in and out. We must understand that.
A starting point for all this could be listing out the different data types or sources. Sure, data classification works for documents, but it doesn’t work for raw data. This is why I feel tailored “policy” or controls are key to preventing your data from being stolen, or at least the data you care about.
Below are a few data types that come to mind:
- Documents (spreadsheets, Word documents, PDFs etc.)
- Raw data (database data, API returns, text files etc.)
- Pictures (JPEGs, diagrams etc.)
- Video (session recordings, meetings, live streams etc.)
- Sounds (voice memos, recordings etc.)
- Compressed (zip files, rar files etc.)
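As a rough illustration of how such a listing could be started, here is a minimal Python sketch that buckets file names into the data types above by extension. The extension mapping and the function names (`data_type_for`, `inventory`) are my own assumptions for the example, not a complete or authoritative classification, and raw data held in databases or APIs would need a different approach entirely.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical mapping from file extension to the data types listed above.
# The extensions are illustrative assumptions, not a complete list.
EXTENSION_TO_TYPE = {
    ".xlsx": "Documents", ".docx": "Documents", ".pdf": "Documents",
    ".csv": "Raw data", ".json": "Raw data", ".txt": "Raw data",
    ".jpg": "Pictures", ".jpeg": "Pictures", ".png": "Pictures",
    ".mp4": "Video", ".mkv": "Video",
    ".mp3": "Sounds", ".wav": "Sounds",
    ".zip": "Compressed", ".rar": "Compressed",
}


def data_type_for(filename: str) -> str:
    """Return the data type for a file name, or "Unknown" if it is unmapped."""
    suffix = PurePath(filename).suffix.lower()
    return EXTENSION_TO_TYPE.get(suffix, "Unknown")


def inventory(filenames):
    """Count how many file names fall into each data type."""
    return Counter(data_type_for(name) for name in filenames)


# Example file names, made up for illustration.
sample = ["payroll.XLSX", "export.csv", "network-diagram.png", "backup.zip", "notes"]
for data_type, count in inventory(sample).items():
    print(data_type, count)
```

Anything that lands in "Unknown" is worth a second look, since those are the files a type-based policy would silently miss.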
After this, we would have to look at how you obtain this data. For example, securing the database is crucial; however, if a web app allows custom queries, you’ve allowed a method of extraction. If that web app is internet-facing, the attack surface is greater still.
This will be a never-ending exercise, but going down this road may flag up some concerns that weren’t visible before. It also makes you re-evaluate how you’ve done things in the past, as there could be more holes in your bucket than you realized.
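To make the exercise a little more concrete, the sketch below records each data source alongside the ways it can be reached, then flags any path that allows custom queries, rating it higher when it is also internet-facing. The class names, the example source and the "medium"/"high" ratings are all assumptions made up for illustration; a real review would use your own inventory and your own risk scale.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPath:
    """One way of reaching a data source (hypothetical model)."""
    name: str
    internet_facing: bool = False
    allows_custom_queries: bool = False


@dataclass
class DataSource:
    """A data source and every path that can read from it."""
    name: str
    paths: list = field(default_factory=list)


def extraction_risks(sources):
    """Return (source, path, rating) for each path that allows custom queries."""
    findings = []
    for source in sources:
        for path in source.paths:
            if not path.allows_custom_queries:
                continue
            # Assumed rating: internet exposure makes the same weakness worse.
            rating = "high" if path.internet_facing else "medium"
            findings.append((source.name, path.name, rating))
    return findings


# Example inventory, made up for illustration.
sources = [
    DataSource("hr_database", [
        AccessPath("internal reporting tool", allows_custom_queries=True),
        AccessPath("customer web app", internet_facing=True, allows_custom_queries=True),
        AccessPath("nightly backup job"),
    ]),
]

for finding in extraction_risks(sources):
    print(finding)
```

The value is less in the code than in being forced to write down every path; the backup job above is not flagged here, yet it is still a route to the data that deserves its own "what if".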
Now that we know where our data is and how to get at it, we can look at how an attacker could steal it.
Below is a table I’ve created that acts as a baseline for data exfiltration.
| Method | Description | Controls |
|---|---|---|
| Email | Email can be used to forward data in the body, as an attachment, or via links. | Prevent and monitor forwarding; however, controlling where users can send to would be hard to manage. If concerns are raised, a targeted policy can be applied to block the recipient domain. |
| Email (Offline) | Emails can be downloaded to a device for offline viewing. | Set data policies at the application/data layer, so that they aren’t dependent on an internet connection. |
| Email (Legacy Protocols) | Emails can be downloaded using these protocols, bypassing security controls. | Block legacy protocols. |
| USB or removable drives (CD included) | Data can be copied to removable devices. This includes CDs, which can sometimes be missed by controls focused at USB level. | Remove drivers if CDs aren’t in use. Block USB access, limit to company USB devices, or force encryption. |
| Mapped drives (Personal; RDP) | Remote tools allow drive mappings from the user’s local machine to the client/server. This allows data to be accessed/copied. | Block mapping of drives by default if the tech allows it. |
| Remote Tool Access | Remote tools such as TeamViewer allow transfer of data, either by file transfer or chat. | Block all non-approved remote tools, including their download websites. |
| Clipboard extraction | Copying a user’s clipboard, transferring from client to client over the clipboard (text, data), and clipboard hijacking (bad browser extensions, bad apps, websites). | Block the clipboard when the technology allows it. Block or manage browser extensions. |
| Internet upload (Alternative ports) | Users can upload to SFTP, open SMB, and alternative ports to extract data (including C2). | Block internet access on ports not used. Prefer defined URL/IP rules over open ones. Endpoint protection can help prevent C2; however, local tools (e.g. WinSCP) can still allow upload. |
| Cloud upload | Users can upload to cloud hosting sites. | Limit or block unapproved cloud sites. |
| Cloud sharing | Users can share direct links with external parties or others, giving them direct access to data. | Limit sharing capabilities when possible, and enforce the limitation using AIP. |
| Cloud app (syncing; offline) | OneDrive and other applications can create an offline sync. | Set data policies at the application/data layer, so that they aren’t dependent on an internet connection. |
| Exposed data (Internet facing) | SFTP, SMB shares and other services exposed to the internet can allow data to be extracted. This includes the famous AWS buckets that continue to be breached. | Limit inbound access on firewalls, reduce exposure of certain ports, use strong ACLs, and run a strong vulnerability management process. |
| Stealing devices (Hard disks) | If the device is not encrypted, or not managed by MDM with internet access, the data on it can be stolen. | Wiping options if MDM (requires Internet), strong policy against stolen devices, encrypt data always (stops transfer but not reading). |
| Public repos | Public repos such as GitHub may be used to extract data. | Public repos can be seen as outside cloud hosting sites, but they have the same functionality to upload data. |
| Website download/upload | The company has a site that allows downloading of data, or the attacker creates a site (not publicly known) to upload data to. | Review internet-facing sites, securing any with strong usernames and passwords. Ensure no sensitive data is stored on company sites. Block ports that aren’t needed, and monitor POST requests (uploads). |
| LAN (Open shares) | If an attacker gains hold of a laptop, they could transfer files over the LAN to their machine using open SMB shares. | A strong username and password should limit users breaking in if the device is lost/stolen. Idle timeout plays a part. Limit or block SMB transfer on public networks. |
| LAN (Cross-over cables) | If an attacker has access to the system, they can create a local network and map to the device. | A strong username and password should limit users breaking in if the device is lost/stolen. Idle timeout plays a part. |
| Live Streaming | Attackers live stream the data so that they don’t have to actually move/copy it. This can be done over Teams or a console session. It also includes taking photos, videos on camera, or snippets. | This bypasses most controls, as has been seen in the wild. Reducing access to the systems and data is the key control. |
| Tunnelling | Creating an encrypted tunnel to the attacker’s network makes the traffic hard to secure/block. | Limiting the tools that allow this and blocking/limiting internet access is key here. |
| Bluetooth | Bluetooth is present on most devices and is hard to control. | Turn off Bluetooth when not required, limiting it to when it is needed. Bluetooth has weaknesses allowing data theft. |
| Mobile Apps | Mobile apps can access data on the device and mobile phones. | Mobile phones have fewer controls if not MDM enabled. Information protection tools will help prevent this as well. |
This table can be padded out with the help of the MITRE ATT&CK® framework: Exfiltration, Tactic TA0010 (Enterprise).
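As one example of turning a row of the table into something checkable, the sketch below takes the “monitor POST requests” and “limit or block unapproved cloud sites” controls and applies them to proxy log records. The record format (dictionaries with `user`, `method`, `url` and `bytes_sent`), the approved host and the 10 MB threshold are all assumptions for illustration; real proxy logs will look different and the threshold would need tuning.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist and threshold, chosen only for this example.
APPROVED_HOSTS = {"sharepoint.example.com"}
UPLOAD_METHODS = {"POST", "PUT"}
SIZE_THRESHOLD_BYTES = 10 * 1024 * 1024  # 10 MB


def host_of(url: str) -> str:
    """Extract the lowercase host from a URL, or "" if there is none."""
    return (urlsplit(url).hostname or "").lower()


def flag_uploads(records, approved_hosts=APPROVED_HOSTS, threshold=SIZE_THRESHOLD_BYTES):
    """Return records that look like large uploads to unapproved hosts."""
    flagged = []
    for record in records:
        if record["method"].upper() not in UPLOAD_METHODS:
            continue  # not an upload-style request
        if host_of(record["url"]) in approved_hosts:
            continue  # destination is on the allowlist
        if record["bytes_sent"] >= threshold:
            flagged.append(record)
    return flagged


# Example records, made up for illustration.
records = [
    {"user": "alice", "method": "POST", "url": "https://sharepoint.example.com/upload", "bytes_sent": 50_000_000},
    {"user": "bob", "method": "POST", "url": "https://files.example.net/upload", "bytes_sent": 50_000_000},
    {"user": "carol", "method": "GET", "url": "https://files.example.net/", "bytes_sent": 50_000_000},
    {"user": "dave", "method": "POST", "url": "https://files.example.net/form", "bytes_sent": 2_000},
]

for record in flag_uploads(records):
    print(record["user"], record["url"], record["bytes_sent"])
```

A check like this only covers one row of the table, and an attacker can stay under the threshold by uploading slowly, which is exactly the kind of “what if” worth asking of every control.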
These are just a few of the techniques that can be used to extract data, and only the common ones at that. Attackers will continue to find ways to steal data in order to hold you to ransom.
The controls you put in are never 100% effective, and when you do implement them, you should ask yourself, “OK, but what if…”. You will then cover around 90% of use cases, and the other 10% will most likely surprise you.
Attackers have a lot more time and resources than you, so they will come up with clever ways. I read a while back about an attacker who breached a company and found themselves unable to extract the data from the network. To get around this, they streamed their session while they read the data, so that they technically had a copy. Sure, it’s not the original, but often it’s the data or content that is the gold mine. With this, they had exfiltrated the data through a means that the company couldn’t secure. It’s the whole “you can’t screenshot your banking app, but you can just take a picture of it” problem. Specific control; ways around it. Being aware of that is what will help you in this fight.