Development of Decentralized Data Storage Systems
If you're reading this after your AWS S3 bucket caused regulatory issues, or after yet another "we're upgrading our infrastructure" notice from a centralized storage provider — welcome. Decentralized data storage isn't about ideology; it's about specific properties: absence of a single point of failure, content verifiability through content addressing, and the ability to store data without an operator's permission.
In 2024, "decentralized storage" refers to three fundamentally different stacks: IPFS + Filecoin, Arweave, and Storj/Sia (incentivized distributed storage). Each suits its own class of problems.
IPFS + Filecoin: content addressing and storage economics
IPFS isn't storage; it's addressing and transport. The Content Identifier (CID) is a multihash of the content, computed by default with SHA2-256 over the file's DAG representation (dag-pb with UnixFS) — for chunked files it hashes the DAG root, not the raw bytes. If the file changes, the CID changes. This fundamental property makes IPFS a natural fit for NFT metadata, verifiable documents, and immutable artifacts.
IPFS's problem is persistence. Nothing in the protocol keeps content alive: if no node pins it, it is eventually garbage-collected and disappears from the network. Filecoin adds the economics: storage providers earn FIL for storing data under cryptographically verified deals.
Working with IPFS Cluster
Production systems require IPFS Cluster — a replication coordinator across multiple IPFS nodes. Minimum configuration: 3 nodes, replication factor 2.
```go
// Example: pinning via the IPFS Cluster REST API (POST /pins/{cid}).
package cluster

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type ClusterPinRequest struct {
	CID            string            `json:"cid"`
	ReplicationMin int               `json:"replication-min"`
	ReplicationMax int               `json:"replication-max"`
	Name           string            `json:"name"`
	Meta           map[string]string `json:"meta"`
}

func PinToCluster(cid string, name string) error {
	req := ClusterPinRequest{
		CID:            cid,
		ReplicationMin: 2,
		ReplicationMax: 3,
		Name:           name,
	}
	body, err := json.Marshal(req)
	if err != nil {
		return err
	}
	resp, err := http.Post(
		"http://cluster-api:9094/pins/"+cid,
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("cluster pin failed: %s", resp.Status)
	}
	return nil
}
```
For large file uploads, using a chunking strategy is critical. By default, IPFS uses size-262144 (256 KB chunks), but for videos and large binaries, the rabin chunker provides better deduplication through content-defined boundaries:
```bash
ipfs add --chunker=rabin-262144-524288-1048576 large_file.bin
```
Filecoin Storage Deals
Direct work with Filecoin via Lotus is the low-level path. For production, use Estuary (deprecated) or modern Lighthouse SDK / web3.storage:
```typescript
import { Web3Storage } from 'web3.storage'

const client = new Web3Storage({ token: process.env.W3S_TOKEN })

async function storeWithReplication(files: File[]): Promise<string> {
  const cid = await client.put(files, {
    wrapWithDirectory: false,
    onRootCidReady: (rootCid) => {
      console.log('Root CID:', rootCid) // available before upload completes
    },
    onStoredChunk: (size) => {
      console.log(`Uploaded chunk of ${size} bytes`)
    }
  })
  return cid
}
```
web3.storage handles hot IPFS pinning plus a cold Filecoin deal automatically. For NFT-specific workloads there is NFT.Storage, with a similar API but focused on NFT metadata and assets.
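Once content is uploaded, the hot/cold split can be monitored: the deprecated web3.storage v4 client exposed `client.status(cid)` returning pin and Filecoin-deal state. A hedged sketch of summarizing such a response (the response shape below is an assumption modeled on that client; verify against your SDK version):

```typescript
// Summarize a storage-status response: how many replicas pin this CID
// and how many active Filecoin deals back it.
// NOTE: this interface is an assumption modeled on the deprecated
// web3.storage v4 client's status response, not a guaranteed API shape.
interface StorageStatus {
  cid: string
  dagSize: number
  pins: { status: string }[]
  deals: { status: string; dealId?: number }[]
}

function summarizeStatus(s: StorageStatus): { pinned: number; activeDeals: number } {
  return {
    pinned: s.pins.filter((p) => p.status === 'Pinned').length,
    activeDeals: s.deals.filter((d) => d.status === 'Active').length,
  }
}

// Usage with a real client would be: const status = await client.status(cid)
const example: StorageStatus = {
  cid: 'bafy...',
  dagSize: 1024,
  pins: [{ status: 'Pinned' }, { status: 'Pinning' }],
  deals: [{ status: 'Active', dealId: 1 }, { status: 'Queued' }],
}
console.log(summarizeStatus(example)) // { pinned: 1, activeDeals: 1 }
```

A useful alerting rule: treat content as at-risk when `pinned` drops below your replication minimum or `activeDeals` hits zero.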
Arweave: permanent storage with one-time payment
Arweave offers a different model: pay once, data stored "forever" (endowment fund designed for 200+ years with conservative storage cost assumptions). This fundamentally changes use cases.
When Arweave is the right choice:
- Smart contract source code and ABI (permanent verifiability)
- NFT metadata and media (avoid NFT rot)
- Legal and notarial documents
- Governance protocols and voting results (DAO governance history)
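Because payment is one-time, cost estimation matters before committing. A gateway's `/price/{bytes}` endpoint (part of the Arweave HTTP API) returns the fee in winston as a plain string, where 1 AR = 10^12 winston. A minimal sketch:

```typescript
// Estimate the one-time Arweave cost for a payload of a given size.
// GET /price/{bytes} on a gateway returns the fee in winston (string);
// 1 AR = 10^12 winston.
const WINSTON_PER_AR = 1e12

function winstonToAr(winston: string): number {
  return Number(winston) / WINSTON_PER_AR
}

async function estimateArCost(bytes: number, gateway = 'https://arweave.net'): Promise<number> {
  const res = await fetch(`${gateway}/price/${bytes}`)
  return winstonToAr(await res.text())
}

// Pure conversion check, no network: 2.5 * 10^12 winston = 2.5 AR
console.log(winstonToAr('2500000000000')) // 2.5
```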
Data in Arweave is a transaction with a data field and tags. Tags are the key to indexing via GraphQL:
```typescript
import Arweave from 'arweave'
import type { JWKInterface } from 'arweave/node/lib/wallet'

const arweave = Arweave.init({
  host: 'arweave.net',
  port: 443,
  protocol: 'https'
})

async function uploadDocument(
  data: Buffer,
  mimeType: string,
  metadata: Record<string, string>,
  jwk: JWKInterface // wallet key used to sign the transaction
) {
  const tx = await arweave.createTransaction({ data }, jwk)
  tx.addTag('Content-Type', mimeType)
  tx.addTag('App-Name', 'YourDApp')
  tx.addTag('Version', '1.0.0')
  // Custom tags for GraphQL search
  for (const [key, value] of Object.entries(metadata)) {
    tx.addTag(key, value)
  }
  await arweave.transactions.sign(tx, jwk)
  await arweave.transactions.post(tx)
  return tx.id // permanent document ID
}
```
Query Arweave via ArQL / GraphQL to search documents by tags:
```graphql
query FindDocuments($owner: String!, $docType: String!) {
  transactions(
    owners: [$owner]
    tags: [
      { name: "App-Name", values: ["YourDApp"] }
      { name: "Document-Type", values: [$docType] }
    ]
    first: 100
  ) {
    edges {
      node {
        id
        tags { name value }
        block { timestamp }
      }
    }
  }
}
```
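Queries like this can be posted to a public gateway's `/graphql` endpoint (e.g. `arweave.net/graphql`). A small sketch of the request and of pulling IDs out of the response envelope (`findByOwner` and the simplified query are illustrative, not from the original):

```typescript
// POST a GraphQL query to an Arweave gateway and extract transaction IDs.
// The /graphql endpoint and { data: { transactions: { edges } } } envelope
// follow the arweave.net gateway's GraphQL API.
const QUERY = `query($owner: String!) {
  transactions(owners: [$owner], first: 10) {
    edges { node { id } }
  }
}`

interface GqlResponse {
  data: { transactions: { edges: { node: { id: string } }[] } }
}

function extractIds(resp: GqlResponse): string[] {
  return resp.data.transactions.edges.map((e) => e.node.id)
}

async function findByOwner(owner: string): Promise<string[]> {
  const res = await fetch('https://arweave.net/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query: QUERY, variables: { owner } }),
  })
  return extractIds(await res.json() as GqlResponse)
}

// Shape check without the network:
const sample: GqlResponse = { data: { transactions: { edges: [{ node: { id: 'tx1' } }] } } }
console.log(extractIds(sample)) // [ 'tx1' ]
```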
Bundlr / Irys: batch upload and instant finality
Native Arweave transactions confirm in ~2 minutes, which is problematic for UX. Irys (formerly Bundlr) solves this with a layer 2 over Arweave: uploads receive an instant receipt, are batched, and are later posted to Arweave.
```typescript
import Irys from '@irys/sdk'

const irys = new Irys({
  url: 'https://node1.irys.xyz',
  token: 'ethereum',
  key: privateKey,
})

// Check cost before upload
const price = await irys.getPrice(data.length)
console.log(`Cost: ${irys.utils.fromAtomic(price)} ETH`)

// Upload with tags
const receipt = await irys.upload(data, {
  tags: [
    { name: 'Content-Type', value: 'application/json' },
    { name: 'Contract-Address', value: contractAddress },
  ]
})
// receipt.id — TXID, available immediately at https://gateway.irys.xyz/{id}
```
Hybrid architecture: hot + cold storage
Real systems rarely use just one protocol. Typical architecture for DApps with performance and permanence requirements:
```
Data write:
  User → App Backend → [in parallel]:
    1. IPFS Cluster    (hot, fast access, ~3 replicas)
    2. Irys → Arweave  (cold, permanent, 1-2 min)
    3. PostgreSQL      (CID + Arweave TXID + metadata, for search)

Data read:
  User → App Backend → PostgreSQL (look up CID/TXID)
    → IPFS Gateway     (fast if pinned)
    → fallback: Arweave Gateway (if IPFS unavailable)
```
This provides instant reads via IPFS, permanence guarantees via Arweave, and search via regular database.
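The parallel write step can be sketched with `Promise.allSettled` and an acceptance policy: the index write and at least one storage tier must succeed, and failed tiers are retried later. `pinToCluster`, `uploadToIrys`, and `saveIndex` would be app-level functions; the stand-ins below are hypothetical:

```typescript
// Fan one document out to hot storage, cold storage, and the search index.
// Policy: accept the write if the index and at least one storage tier
// succeed; a single failed tier is queued for retry, not a hard error.
type WriteResult = { tier: string; ok: boolean }

async function writeEverywhere(
  tasks: Record<string, () => Promise<unknown>>
): Promise<WriteResult[]> {
  const names = Object.keys(tasks)
  const settled = await Promise.allSettled(names.map((n) => tasks[n]()))
  return settled.map((r, i) => ({ tier: names[i], ok: r.status === 'fulfilled' }))
}

function writeAccepted(results: WriteResult[]): boolean {
  const ok = new Set(results.filter((r) => r.ok).map((r) => r.tier))
  return ok.has('index') && (ok.has('hot') || ok.has('cold'))
}

// Usage with stand-ins for the real pinToCluster / uploadToIrys / saveIndex:
writeEverywhere({
  hot:   async () => 'pinned',
  cold:  async () => { throw new Error('irys timeout') },
  index: async () => 'row inserted',
}).then((results) => {
  console.log(writeAccepted(results)) // true: index + one storage tier succeeded
})
```

The asymmetry is deliberate: losing the cold copy temporarily is recoverable, losing the index entry means the CID becomes unfindable.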
Integrity verification
Content addressing gives IPFS built-in verification: the CID is itself a hash of the content. For Arweave, the simplest check is to re-download the data and re-hash it:
```typescript
import { createHash } from 'node:crypto'

async function verifyArweaveData(txId: string, expectedHash: string): Promise<boolean> {
  await arweave.transactions.get(txId) // throws if the tx doesn't exist
  const data = await arweave.transactions.getData(txId, { decode: true })
  const hash = createHash('sha256').update(data as Buffer).digest('hex')
  return hash === expectedHash
}
```
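On the IPFS side, the check falls out of content addressing itself: recompute the CID from the fetched bytes and compare. As an illustration, here is a dependency-free sketch for the simplest case, a raw block CIDv1 (codec 0x55, SHA2-256, base32lower); real files chunked through UnixFS/dag-pb need a library such as multiformats instead:

```typescript
import { createHash } from 'node:crypto'

// base32 lower, RFC 4648 alphabet, no padding (the multibase 'b' encoding)
const ALPHABET = 'abcdefghijklmnopqrstuvwxyz234567'

function base32encode(bytes: Uint8Array): string {
  let bits = 0, value = 0, out = ''
  for (const b of bytes) {
    value = (value << 8) | b
    bits += 8
    while (bits >= 5) {
      out += ALPHABET[(value >>> (bits - 5)) & 31]
      bits -= 5
    }
  }
  if (bits > 0) out += ALPHABET[(value << (5 - bits)) & 31]
  return out
}

// CIDv1 = version(0x01) ++ codec(0x55 raw) ++ multihash(0x12 sha2-256, 0x20 length, digest)
function rawCidV1(data: Buffer): string {
  const digest = createHash('sha256').update(data).digest()
  const bytes = Buffer.concat([Buffer.from([0x01, 0x55, 0x12, 0x20]), digest])
  return 'b' + base32encode(bytes) // 'b' = multibase prefix for base32lower
}

// Verification: recompute the CID from the fetched bytes and compare.
function verifyRawBlock(cid: string, data: Buffer): boolean {
  return rawCidV1(data) === cid
}

const block = Buffer.from('hello decentralized storage')
const cid = rawCidV1(block)
console.log(verifyRawBlock(cid, block))                   // true
console.log(verifyRawBlock(cid, Buffer.from('tampered'))) // false
```

This is why gateway responses are trustworthy even from untrusted gateways: the client can always re-derive the CID locally.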
Encryption and access control
Decentralized storage doesn't mean public. For sensitive data — Lit Protocol for threshold encryption with on-chain access conditions:
```typescript
import * as LitJsSdk from '@lit-protocol/lit-node-client'

// Access condition: holder of at least one NFT from the collection
const accessControlConditions = [{
  contractAddress: NFT_CONTRACT,
  standardContractType: 'ERC721',
  chain: 'ethereum',
  method: 'balanceOf',
  parameters: [':userAddress'],
  returnValueTest: { comparator: '>', value: '0' }
}]

// Encrypt before uploading to IPFS
const { ciphertext, dataToEncryptHash } = await LitJsSdk.encryptString(
  { accessControlConditions, dataToEncrypt: sensitiveData },
  litNodeClient
)

// Decrypt — only if the on-chain condition is met
const decrypted = await LitJsSdk.decryptToString(
  { accessControlConditions, ciphertext, dataToEncryptHash, chain: 'ethereum' },
  litNodeClient
)
```
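The ciphertext, its hash, and the access conditions must travel together, since decryption needs all three. A small sketch of an envelope to store on IPFS (the field names and `version` field are an app-level convention of this sketch, not part of the Lit SDK):

```typescript
// Envelope stored on IPFS: everything a reader needs to request decryption.
// NOTE: this shape is an app-level convention, not a Lit Protocol format.
interface EncryptedEnvelope {
  version: 1
  ciphertext: string
  dataToEncryptHash: string
  accessControlConditions: unknown[]
}

function packEnvelope(
  ciphertext: string,
  dataToEncryptHash: string,
  accessControlConditions: unknown[]
): string {
  const env: EncryptedEnvelope = { version: 1, ciphertext, dataToEncryptHash, accessControlConditions }
  return JSON.stringify(env)
}

function unpackEnvelope(json: string): EncryptedEnvelope {
  const env = JSON.parse(json) as EncryptedEnvelope
  if (env.version !== 1) throw new Error(`unsupported envelope version: ${env.version}`)
  return env
}

const packed = packEnvelope('ct-base64', 'abc123', [{ chain: 'ethereum' }])
console.log(unpackEnvelope(packed).dataToEncryptHash) // abc123
```

Versioning the envelope up front makes later migration (new condition schema, different encryption mode) painless.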
Cost model and stack selection
| Criterion | IPFS + Cluster | Filecoin Deals | Arweave / Irys | Storj |
|---|---|---|---|---|
| Permanence | While pinned | 1–5 years per deal | Permanent | While paid |
| Read speed | High | Slow (retrieval) | Medium | High |
| Cost | Infrastructure | ~$0.0001/GB/month | ~$10/GB one-time | ~$4/TB/month |
| Censorship resistance | Medium | High | Very high | Medium |
| Search | CID only | No | GraphQL by tags | No |
For NFT projects and DAOs, Arweave via Irys is usually the right choice: metadata volumes are tiny, so the one-time cost is negligible, and permanence is critical. For application data with low-latency requirements: IPFS Cluster with hot replication plus Filecoin for archive copies. For enterprise deployments with compliance requirements: the hybrid scheme with Lit Protocol encryption.