Extending DSPM to NAS: One Platform for Cloud, SaaS & On-Prem

Network-attached storage holds more sensitive unstructured data than any other system in the enterprise. Business-critical processes still run on those shares every day, and now teams are racing to put that data to work with AI. Before they can, they need to know what is on those shares.

Key takeaways

NAS holds most enterprise unstructured data, and AI pipelines now turn its dormant permission gaps into live exposures that any user can retrieve.
Bedrock Data extends DSPM to NAS with an outpost that classifies file content locally; only metadata leaves, so file bytes never exit your environment.
The outpost reads SMB, NFSv3, and NFSv4, normalizes all ACLs into a single permission graph, and self-throttles to scan production filers without affecting their workloads.
NAS, SaaS, and cloud data land in a single Metadata Lake under a single classification and access scheme, supporting HIPAA, GDPR, and SOC 2 evidence, among other use cases.
Classifying NAS before AI ingestion or cloud migration lets you exclude sensitive directories from RAG indexes and remediate exposures before they cause harm.

This post covers why NAS is back on the security agenda, why file shares are harder to scan than cloud data sources, and how Bedrock Data extends DSPM coverage to NAS, alongside cloud and SaaS, without moving data out of your environment.

NAS is back on the security agenda, and AI is the reason

For a decade, network-attached storage has sat in a quiet corner of the security backlog. The data on those shares is older than most of the governance programs around it, and storage teams maintain it rather than data security teams. The data stayed just as sensitive the whole time, yet it was easy to defer. Cloud object stores, SaaS platforms, and data warehouses produced more visible risk per square inch of attention, and that is where DSPM programs went first.

That deferral is ending, and the reason is AI. Two pressures are forcing NAS back onto the agenda.

Pressure 1: AI pipelines surface dormant ACL exposures

NAS used to be hard to misuse, because finding a sensitive file meant knowing where it lived and having access to it. Obscurity protected the data as much as access control did. The IP address, the credentials in a 2018 backup directory, the contractor's home directory readable by a stale group: all reachable in principle, none findable in practice. AI strips away the obscurity and, in most pipelines, the access control with it. A typical AI data pipeline crawls content under one connector identity, embeds it, and stores it in an index that keeps no trace of the file's original ACL. Unless the system filters retrieval by the asking user's own entitlements, the copilot answers from a file that the user could never have opened. The exposure was always sitting in the ACL; the pipeline simply makes it reachable by anyone who asks, turning every dormant exposure into an active one.
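The difference between an index that forgets the ACL and one that filters by the asking user's entitlements can be sketched in a few lines. This is a minimal illustration, not any product's retrieval path: the `Chunk` schema, the `retrieve` function, and the substring match standing in for vector search are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """An indexed chunk that keeps the source file's readers (hypothetical schema)."""
    path: str
    text: str
    allowed_principals: frozenset  # users and groups allowed to read the source file


def retrieve(chunks, query, user, user_groups):
    """Return matching chunks that the asking user is entitled to read.

    The substring test is a stand-in for vector search; the entitlement
    check is the part that matters here.
    """
    identities = {user} | set(user_groups)
    return [
        c for c in chunks
        if query.lower() in c.text.lower()
        and identities & c.allowed_principals
    ]


index = [
    Chunk("/hr/salaries.xlsx", "salary bands for 2024", frozenset({"hr-team"})),
    Chunk("/eng/readme.txt", "salary survey link for engineers", frozenset({"all-staff"})),
]

# alice is only in all-staff, so the HR spreadsheet is filtered out.
hits = retrieve(index, "salary", user="alice", user_groups=["all-staff"])
paths = [c.path for c in hits]
```

Drop the `identities & c.allowed_principals` check and both chunks come back for every user, which is the failure mode described above.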

Pressure 2: Migration moves unclassified data into the cloud

Enterprises are moving NAS data to the cloud so AI tools can reach it through native connectors rather than on-prem-only protocols, a move bound by petabyte-scale shares and data residency rules in regulated environments. Teams cannot do it safely without first knowing what is on the shares. Moving content into the cloud without classifying it first lands every existing exposure on the cloud side intact, where the AI tools that triggered the migration are waiting to surface it.

Why NAS has stayed dark

Three constraints have kept NAS outside the perimeter of most data security programs. None of them is a technology problem. Each is an operational reality that any approach has to handle before it can deploy at all.

Protocols and permissions are heterogeneous

A single enterprise typically runs SMB on Windows-facing shares, NFSv3 on legacy Unix workloads, and NFSv4 on Kerberos-secured Unix workloads side by side, often on the same physical filer, sometimes with cloud-tiered gateways like Nasuni or Panzura in front.

Each protocol carries its own access-control vocabulary, whether POSIX mode bits, NFSv4 ACLs, or Windows DACLs. They encode subjects differently, enumerate actions differently, and handle allow versus deny differently. A data governance solution has to read all of them and translate the permissions into a common representation without losing the source fidelity along the way.
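To make the translation problem concrete, here is a rough sketch of folding two ACL vocabularies into one representation. It is an illustration under stated assumptions, not Bedrock Data's permission graph: the `Grant` type, the `from_simplified_ace` helper, and the deny-wins evaluation are hypothetical, and the allow/deny entry is a simplified stand-in rather than a real Windows access mask. POSIX class precedence and ACE ordering are deliberately not modeled.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str  # e.g. "user:1001", "group:200", "everyone"
    action: str     # "read", "write", or "execute"
    effect: str     # "allow" or "deny"
    source: str     # original encoding, kept so source fidelity is not lost


def from_posix_mode(mode, uid, gid):
    """Translate POSIX mode bits into allow grants."""
    table = [
        (f"user:{uid}", stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        (f"group:{gid}", stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        ("everyone", stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    ]
    grants = []
    for principal, r, w, x in table:
        for action, bit in (("read", r), ("write", w), ("execute", x)):
            if mode & bit:
                grants.append(Grant(principal, action, "allow", f"posix:{oct(mode)}"))
    return grants


def from_simplified_ace(ace_type, principal, actions):
    """Translate a simplified allow/deny entry (hypothetical, not a real DACL parser)."""
    effect = "deny" if ace_type == "deny" else "allow"
    return [Grant(principal, a, effect, f"ace:{ace_type}") for a in actions]


def can(grants, identities, action):
    """Simplified evaluation: any matching deny wins, otherwise any allow grants access."""
    relevant = [g for g in grants if g.action == action and g.principal in identities]
    if any(g.effect == "deny" for g in relevant):
        return False
    return any(g.effect == "allow" for g in relevant)


posix_grants = from_posix_mode(0o640, uid=1001, gid=200)
with_deny = posix_grants + from_simplified_ace("deny", "group:200", ["read"])
```

Keeping the `source` field on every grant is one way to preserve the original encoding alongside the common form.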

Crawling and throttling have to be intelligent

Tens to hundreds of millions of files per share is normal, and the right scan rate on a high-end NetApp AFF differs from the right rate on a midrange Pure FlashBlade or a commodity NAS on spinning disk. The hardware profile decides how much the share can absorb without slowing the applications and users that depend on it. A data governance solution has to tune itself to each one, because without that, scanning becomes a denial-of-service attack with extra steps.

Data cannot leave the building

Legal, regulated, and sovereign workloads cannot let file bytes leave the customer environment. Any architecture that ships content out to a cloud classifier for processing is a non-starter.

How Bedrock Data's NAS outpost is built

The Bedrock Data NAS outpost runs as a self-contained appliance inside your network. It reads NAS shares directly over SMB, NFSv3, and NFSv4 using the authentication methods enterprise environments rely on (NTLM, Kerberos, AUTH_SYS), captures the underlying ACL state exactly as the filer has written it, classifies content locally, and sends a tightly scoped metadata stream back to the Bedrock Data control plane. File bytes never leave your environment.

What does leave is a defined set of metadata:

Basic file metadata, such as filename, file size, and file type
Classification outcomes, meaning categorical labels (PII, PHI, PCI, secrets, customer-defined tags), along with counts and confidence scores for each file
The Bedrock Data-normalized permission graph, showing who has access to what, derived from the ACLs the outpost reads locally and expressed in the same identity and access representation Bedrock Data uses across the rest of the platform
Operational telemetry, covering scan progress, error counts, and throughput
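One way to picture the boundary is a record type that has no field for content at all. The sketch below is an assumption-laden illustration, not the outpost's actual schema: `FileMetadata`, `build_record`, and `toy_classifier` are hypothetical names, the regex stands in for the trained models described later in this post, and the 0.9 confidence is a placeholder.

```python
import json
import re
from dataclasses import asdict, dataclass, field

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class FileMetadata:
    """What leaves the network in this sketch: no field can carry file bytes."""
    filename: str
    size_bytes: int
    file_type: str
    labels: dict = field(default_factory=dict)  # label -> {"count", "confidence"}


def toy_classifier(content):
    """Regex stand-in for a real classifier; the confidence value is a placeholder."""
    text = content.decode("utf-8", errors="ignore")
    hits = SSN_PATTERN.findall(text)
    return {"PII": {"count": len(hits), "confidence": 0.9}} if hits else {}


def build_record(path, content, classify):
    """Classify locally, then keep only labels, counts, and basic file metadata."""
    file_type = path.rsplit(".", 1)[-1] if "." in path else ""
    return FileMetadata(
        filename=path,
        size_bytes=len(content),
        file_type=file_type,
        labels=classify(content),
    )


content = b"name,ssn\nA. Example,123-45-6789\n"
record = build_record("hr/list.csv", content, toy_classifier)
payload = json.dumps(asdict(record))  # the only thing that would be sent
```

The matched value itself never appears in `payload`; only the label, the count, and the confidence do.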

In practice, the outpost covers the systems that show up in enterprise environments: NetApp ONTAP, Dell PowerScale, Pure FlashBlade, Nutanix Files, Windows file shares, Nasuni-fronted volumes, and commodity NFS and SMB servers.

Deployment and operation

The customer-facing surface splits cleanly across two places. The outpost lives in your on-prem network, where administrators configure shares. The Bedrock Data UI lives in the cloud, where they review classification results and the access graph. Anything that touches your filer credentials or reads file content stays on your side and never reaches the cloud.

Deployment follows four steps:

Deploy the outpost: Deploy the Bedrock Data outpost once, into your own network, sized for the share footprint it will cover. It runs alongside the rest of your infrastructure.
Open the outpost console: An administrator opens the outpost console, a browser-based UI hosted on the outpost itself and reachable from inside your network. The console is where you tell the outpost what to do, and it stays isolated from Bedrock Data's cloud.
Add a share: Onboarding a share is a single workflow in the console. The administrator specifies the details (the protocol, the host, the export or share path, and so on) that the outpost should use to connect to the filer. Administrators add and remove shares from the same view.
Run scans and review results: The outpost runs scans periodically once the credentials are set. When a scan completes, classification results and the access graph for the scanned shares land in Bedrock Data's Metadata Lake, reachable through the UI and the API.

How Bedrock Data classifies data on NAS

Trained language models do the recognition work inside your network, so file content never leaves the boundary. The Bedrock Data on-premises outpost ships with the same models Bedrock Data runs for cloud-native data sources, so classification stays consistent across every environment. Categories cover PII, PHI, PCI, secrets, intellectual property, and customer-defined classifications, with multilingual coverage. This matters for the enterprise because accurate PII and PHI classification is the foundation of HIPAA, GDPR, and SOC 2 evidence. At legal, medical, and finance firms that govern NAS with Bedrock Data, the platform has found and helped remove sensitive data, including SSNs and cleartext passwords, that earlier DSPM scans had missed, in a fraction of the time.

How Bedrock Data scans NAS without disrupting production

The outpost self-throttles to match the performance profile of each filer it scans. The right concurrency for a high-end NetApp share differs from the right concurrency for a midrange filer in a branch office: aggressive enough to finish scans inside your discovery SLAs on the shares that can handle it, conservative enough to keep the safety margin intact on the shares that cannot.
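The post does not say how the outpost picks its concurrency, so the following is only a sketch of one common approach: additive-increase, multiplicative-decrease (AIMD) driven by observed filer latency. The `AdaptiveThrottle` class, its parameters, and the 20 ms target are assumptions for illustration.

```python
class AdaptiveThrottle:
    """AIMD concurrency control: add one worker while the filer stays fast,
    halve the worker count as soon as latency crosses the target."""

    def __init__(self, min_workers=1, max_workers=64, target_latency_ms=20.0):
        self.min_workers = min_workers
        self.max_workers = max_workers
        self.target_latency_ms = target_latency_ms
        self.workers = min_workers  # start conservative

    def observe(self, latency_ms):
        """Feed one latency sample and return the new worker count."""
        if latency_ms > self.target_latency_ms:
            # Filer is under pressure: back off quickly.
            self.workers = max(self.min_workers, self.workers // 2)
        else:
            # Filer has headroom: probe upward slowly.
            self.workers = min(self.max_workers, self.workers + 1)
        return self.workers


throttle = AdaptiveThrottle(max_workers=8)
history = [throttle.observe(ms) for ms in (5, 6, 7, 40, 8)]
```

A fast filer climbs toward `max_workers` and stays there, while a slow one keeps getting pushed back toward `min_workers`, which is the per-filer behavior the paragraph above describes.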

NAS, SaaS, and infrastructure data in one place

NAS is the last large data source most governance programs cannot see, and scanning it is only worth the effort if the result lives where the rest of your data classifications already do.

A governance program works best when it applies consistently across the business. Bedrock Data for NAS lets you govern NAS from the same control plane you use for cloud-native sources, so the same controls (data retention policies, access rules for sensitive data) apply wherever the data lives. An engineering file on a NetApp share, a Confluence page, and a Snowflake table show up in one inventory under a single access and classification scheme in Bedrock Data's Metadata Lake. When data moves from NAS to S3, Bedrock Data links the replicated files and surfaces the movement in either direction.

What this looks like in practice

Engineering access review on a regulated share

A quarterly access review on an engineering NAS volume holding source code, design documents, and pre-release product specifications starts from the platform's inventory: classification of the contents on one side, the captured ACL state on the other, joined to your directory. Starting there turns the review into a focused exercise instead of a slog through raw ACL exports. For teams under SOC 2 or ISO 27001, it gives you the classified, access-mapped starting point that those periodic access reviews depend on. Your reviewer still makes the call, but on data that shows which access actually matters.

Onboarding a NAS during an acquisition

Bedrock Data can scan a freshly acquired company's filer the same day. Within the week, classification coverage and an access map for the acquired share land in the parent company's governance view. Skip that step, and the acquiring company inherits unknown risk: acquired shares routinely carry PII, PCI records, and intellectual property that nobody on the parent side can see or govern.

AI and RAG readiness on a file share

Before you point a retrieval pipeline at an engineering share, the platform lists the directories holding PII, PCI, secrets, or customer-defined restricted categories, and those directories stay out of the index. When unclassified NAS data is ingested into a RAG index, the model can surface whatever it retrieves, regardless of who was originally allowed to open the file, turning a dormant file-share permission into a live answer for any user who asks. The files most at risk are the ones that accumulate quietly on shares: HR records, financial spreadsheets, source code, scanned PDFs of contracts. Because Bedrock Data classifies continuously rather than once, newly added or modified files are re-evaluated, so the exclusion list stays accurate.
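As a rough sketch of how a directory exclusion list might gate ingestion, assume classification results arrive as a mapping from file path to a set of labels. The `RESTRICTED` set, the function names, and the input shape are hypothetical and not part of the platform's API.

```python
from pathlib import PurePosixPath

# Labels that keep a directory out of the index (an assumption for this sketch).
RESTRICTED = {"PII", "PCI", "secrets"}


def excluded_directories(file_labels):
    """Return every directory that directly holds a file with a restricted label."""
    return {
        str(PurePosixPath(path).parent)
        for path, labels in file_labels.items()
        if labels & RESTRICTED
    }


def indexable(paths, excluded):
    """Keep only paths with no excluded directory anywhere above them."""
    def blocked(path):
        ancestors = {str(p) for p in PurePosixPath(path).parents}
        return bool(ancestors & excluded)

    return [p for p in paths if not blocked(p)]


file_labels = {
    "/eng/specs/design.md": set(),
    "/eng/hr/payroll.csv": {"PII"},
    "/eng/hr/old/notes.txt": set(),
}

excluded = excluded_directories(file_labels)
to_index = indexable(list(file_labels), excluded)
```

Excluding by directory is deliberately coarse: `/eng/hr/old/notes.txt` carries no label of its own but stays out because it sits under an excluded directory. Re-running both functions after each scan keeps the list current as files are added or modified.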

Managing data ROT for security post-Mythos

In a post-Mythos world, where attackers breach and exfiltrate data more easily than before, managing your blast radius matters as much as faster detection and response. Bedrock Data's Metadata Lake identifies and deletes data ROT (redundant, obsolete, and trivial), shrinking the regulatory, compliance, and security blast radii of your data. That payoff is largest on big NAS workloads that have accumulated files for years. The risk is measurable: IBM's Cost of a Data Breach Report 2024 found that 35% of breaches involved shadow data in unmanaged sources, and those breaches cost 16% more and took 26% longer to identify than breaches without it.
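For a sense of what ROT detection involves, here is a minimal sketch that flags redundant files by content hash and obsolete files by age. It is not how the Metadata Lake works internally: the `rot_candidates` function, the seven-year threshold, and the in-memory file tuples are assumptions for illustration, and the "trivial" category is left out because it depends on policy.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def rot_candidates(files, now, max_age_days=7 * 365):
    """files: list of (path, content_bytes, modified_datetime) tuples.

    Redundant: every copy of identical content except the first path in sort order.
    Obsolete: anything last modified before the age cutoff.
    """
    by_digest = defaultdict(list)
    for path, content, _ in files:
        by_digest[hashlib.sha256(content).hexdigest()].append(path)

    redundant = sorted(
        path for paths in by_digest.values() for path in sorted(paths)[1:]
    )
    cutoff = now - timedelta(days=max_age_days)
    obsolete = sorted(path for path, _, modified in files if modified < cutoff)
    return {"redundant": redundant, "obsolete": obsolete}


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
files = [
    ("/a/report.docx", b"same bytes", datetime(2024, 6, 1, tzinfo=timezone.utc)),
    ("/b/report-copy.docx", b"same bytes", datetime(2024, 6, 2, tzinfo=timezone.utc)),
    ("/c/old.log", b"stale", datetime(2010, 3, 1, tzinfo=timezone.utc)),
]
candidates = rot_candidates(files, now)
```

The output is a list of candidates for review rather than a deletion list; which copy to keep and what age counts as obsolete are policy decisions.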

Closing

NAS is the data source most enterprises have hoped to ignore long enough that someone else would solve it. AI has shortened that timeline. Something will reach the data on those shares, either a migration that lifts it into the cloud or an agent handed a route to where it sits. In both cases, the prerequisite is the same: you have to know what is on the shares and who can reach it, so you can govern how AI uses it.

Welcome to all of your data, in one place. Welcome to Bedrock Data.

See what's on your file shares before AI does. Read the Bedrock Data for NAS & File Shares solution brief to see how the outpost discovers, classifies, and maps access to sensitive data on NAS, entirely inside your environment. Or book a demo with us today.

FAQs

What is DSPM for NAS?

DSPM for NAS applies data security posture management to network-attached storage and file shares, using the same discovery, classification, and access analysis applied to cloud data. It answers what sensitive data sits on your shares and who can reach it, so you can govern that data.

Does Bedrock Data store file contents from NAS?

No. The Bedrock Data outpost runs inside your network, reads and classifies file content locally, and sends only metadata to the control plane: filenames, classification labels with counts and confidence, the normalized permission graph, and scan telemetry. File bytes never leave your environment, which is what makes the approach viable for regulated and sovereign workloads.

What types of sensitive data does Bedrock Data find on NAS?

Bedrock Data classifies the regulated and high-risk data that accumulates on file shares: PII, PHI, and PCI. It also surfaces the things that shouldn't be on a share at all, such as cleartext passwords and API keys. Detection extends to any category you define with your own rules or examples, so coverage isn't capped at a fixed list. That classification is the data inventory on which HIPAA, GDPR, PCI DSS, and SOC 2 programs build their evidence.

Which NAS systems does Bedrock Data support?

The outpost covers the systems common in enterprise environments: NetApp ONTAP, Dell PowerScale, Pure FlashBlade, Nutanix Files, Windows file shares, Nasuni-fronted volumes, and commodity NFS and SMB servers. It connects over SMB, NFSv3, and NFSv4 using NTLM, Kerberos, and AUTH_SYS authentication.
