Development of Decentralized Data Storage Systems

If you're reading this after your AWS S3 bucket caused regulatory issues, or after yet another "we're upgrading our infrastructure" notice from a centralized storage provider — welcome. Decentralized data storage isn't about ideology; it's about specific properties: absence of a single point of failure, content verifiability through content addressing, and the ability to store data without an operator's permission.

In 2024, "decentralized storage" refers to three fundamentally different stacks: IPFS + Filecoin, Arweave, and Storj/Sia (incentivized distributed storage). Each suits its own class of problems.

IPFS + Filecoin: content addressing and storage economics

IPFS isn't storage; it's addressing and transport. The Content Identifier (CID) is a multihash of file contents, computed by default as SHA2-256 through DAG structure (dag-pb with UnixFS). If the file changes — the CID changes. This fundamental property makes IPFS suitable for NFT metadata, verifiable documents, immutable artifacts.
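The defining property — same content, same address — can be illustrated with a plain SHA-256 sketch. Note this is an illustration only: a real CID is a multihash over DAG-encoded blocks, so the hex digest below is not an actual CID.

```typescript
import { createHash } from 'node:crypto'

// Illustration only: a real CID is a multihash over DAG-encoded blocks,
// not a bare SHA-256 of the file, but the property is the same —
// the address is derived entirely from the content.
function contentAddress(data: Buffer): string {
  return createHash('sha256').update(data).digest('hex')
}

const original = contentAddress(Buffer.from('{"name":"Token #1"}'))
const modified = contentAddress(Buffer.from('{"name":"Token #2"}'))

console.log(original === contentAddress(Buffer.from('{"name":"Token #1"}'))) // true: same content, same address
console.log(original === modified) // false: any change yields a new address
```

This is why a CID pasted into NFT metadata stays verifiable forever: anyone can re-hash the content and compare.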

IPFS's weak spot is pinning: if no node keeps your content pinned, it is eventually garbage-collected and disappears from the network. Filecoin adds the economics: storage providers earn FIL for honoring storage deals, with the storage itself verified by on-chain proofs (Proof-of-Replication and Proof-of-Spacetime).

Working with IPFS Cluster

Production systems require IPFS Cluster — a replication coordinator across multiple IPFS nodes. Minimum configuration: 3 nodes, replication factor 2.

// Example: pinning via the IPFS Cluster REST API
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

type ClusterPinRequest struct {
    CID            string            `json:"cid"`
    ReplicationMin int               `json:"replication-min"`
    ReplicationMax int               `json:"replication-max"`
    Name           string            `json:"name"`
    Meta           map[string]string `json:"meta"`
}

func PinToCluster(cid string, name string) error {
    req := ClusterPinRequest{
        CID:            cid,
        ReplicationMin: 2,
        ReplicationMax: 3,
        Name:           name,
    }
    body, err := json.Marshal(req)
    if err != nil {
        return err
    }
    resp, err := http.Post(
        "http://cluster-api:9094/pins/"+cid,
        "application/json",
        bytes.NewReader(body),
    )
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusAccepted {
        return fmt.Errorf("cluster pin failed: %s", resp.Status)
    }
    return nil
}

For large file uploads, the chunking strategy is critical. By default, IPFS uses size-262144 (fixed 256 KB chunks), but for videos and large binaries the rabin chunker gives better deduplication through content-defined boundaries (the three parameters are min, average, and max chunk size in bytes):

ipfs add --chunker=rabin-262144-524288-1048576 large_file.bin

Filecoin Storage Deals

Direct work with Filecoin via Lotus is the low-level path. For production, skip the deprecated Estuary and use the Lighthouse SDK or web3.storage:

// Legacy web3.storage client; the newer w3up client exposes a different API
import { Web3Storage } from 'web3.storage'

const client = new Web3Storage({ token: process.env.W3S_TOKEN })

async function storeWithReplication(files: File[]): Promise<string> {
    const cid = await client.put(files, {
        wrapWithDirectory: false,
        onRootCidReady: (rootCid) => {
            console.log('Root CID:', rootCid) // available before upload completes
        },
        onStoredChunk: (size) => {
            console.log(`Uploaded chunk of ${size} bytes`)
        }
    })
    return cid
}

web3.storage handles the hot IPFS pin plus a cold Filecoin deal automatically. For NFT projects, NFT.Storage offers a similar API focused on metadata.

Arweave: permanent storage with one-time payment

Arweave offers a different model: pay once, data stored "forever" (endowment fund designed for 200+ years with conservative storage cost assumptions). This fundamentally changes use cases.
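The one-time fee can be estimated against the gateway's public `GET /price/{bytes}` endpoint, which returns the cost in winston (1 AR = 10^12 winston). A minimal sketch; the helper names are illustrative:

```typescript
// Sketch: estimate the one-time cost of storing `bytes` on Arweave.
// GET https://arweave.net/price/{bytes} returns the fee in winston.
const WINSTON_PER_AR = 1e12 // 1 AR = 10^12 winston

function winstonToAr(winston: number): number {
  return winston / WINSTON_PER_AR
}

async function estimateStorageCostAr(bytes: number): Promise<number> {
  const res = await fetch(`https://arweave.net/price/${bytes}`)
  const winston = Number(await res.text())
  return winstonToAr(winston)
}

// estimateStorageCostAr(1024 * 1024).then(ar => console.log(`1 MiB ≈ ${ar} AR`))
```

Checking the price before upload matters because the fee scales with size and with the current network storage cost, not with time stored.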

When Arweave is the right choice:

  • Smart contract source code and ABI (permanent verifiability)
  • NFT metadata and media (avoid NFT rot)
  • Legal and notarial documents
  • Governance protocols and voting results (DAO governance history)

Data in Arweave is a transaction with a data field and tags. Tags are the key to indexing via GraphQL:

import Arweave from 'arweave'

const arweave = Arweave.init({
    host: 'arweave.net',
    port: 443,
    protocol: 'https'
})

async function uploadDocument(data: Buffer, mimeType: string, metadata: Record<string, string>) {
    const tx = await arweave.createTransaction({ data })

    tx.addTag('Content-Type', mimeType)
    tx.addTag('App-Name', 'YourDApp')
    tx.addTag('Version', '1.0.0')

    // Custom tags for GraphQL search
    for (const [key, value] of Object.entries(metadata)) {
        tx.addTag(key, value)
    }

    // jwk: the wallet keyfile (JWK) loaded beforehand; it pays for and signs the tx
    await arweave.transactions.sign(tx, jwk)
    const response = await arweave.transactions.post(tx)
    if (response.status !== 200) {
        throw new Error(`Arweave upload failed with status ${response.status}`)
    }

    return tx.id // permanent document ID
}

Query Arweave through the gateway's GraphQL endpoint (the older ArQL is deprecated) to search documents by tags:

query FindDocuments($owner: String!, $docType: String!) {
  transactions(
    owners: [$owner]
    tags: [
      { name: "App-Name", values: ["YourDApp"] }
      { name: "Document-Type", values: [$docType] }
    ]
    first: 100
  ) {
    edges {
      node {
        id
        tags { name value }
        block { timestamp }
      }
    }
  }
}
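Running such a query is a plain HTTP POST to the gateway's `/graphql` endpoint. A minimal sketch, with illustrative helper names; `query` stands for the query string shown above:

```typescript
// Sketch: execute a tag-based search against the arweave.net GraphQL gateway.
const GRAPHQL_ENDPOINT = 'https://arweave.net/graphql'

function buildGraphqlRequest(query: string, variables: Record<string, string>) {
  return {
    method: 'POST' as const,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  }
}

async function findDocuments(query: string, owner: string, docType: string) {
  const res = await fetch(
    GRAPHQL_ENDPOINT,
    buildGraphqlRequest(query, { owner, docType }),
  )
  const { data } = await res.json()
  // Each edge carries the tx id, tags, and block timestamp selected in the query
  return data.transactions.edges.map((e: { node: unknown }) => e.node)
}
```

The returned transaction ids can then be fetched directly from any gateway as `https://arweave.net/{id}`.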

Bundlr / Irys: batch upload and instant finality

Native Arweave transactions take ~2 minutes to confirm, which is problematic for UX. Irys (formerly Bundlr) solves this with a layer 2 on top of Arweave: uploads are acknowledged instantly with a receipt, then batched and posted to Arweave.

import Irys from '@irys/sdk'

const irys = new Irys({
    url: 'https://node1.irys.xyz',
    token: 'ethereum',
    key: privateKey,
})

// Check cost before upload
const price = await irys.getPrice(data.length)
console.log(`Cost: ${irys.utils.fromAtomic(price)} ETH`)

// Upload with tags
const receipt = await irys.upload(data, {
    tags: [
        { name: 'Content-Type', value: 'application/json' },
        { name: 'Contract-Address', value: contractAddress },
    ]
})
// receipt.id — TXID, available immediately at https://gateway.irys.xyz/{id}

Hybrid architecture: hot + cold storage

Real systems rarely use just one protocol. Typical architecture for DApps with performance and permanence requirements:

Data Write:
User → App Backend → [in parallel]:
    1. IPFS Cluster (hot, fast access, ~3 replicas)
    2. Irys → Arweave (cold, permanent, 1-2 min)
    3. PostgreSQL (CID + Arweave TXID + metadata, for search)

Data Read:
User → App Backend → PostgreSQL (lookup CID/TXID)
    → IPFS Gateway (fast if pinned)
    → Fallback: Arweave Gateway (if IPFS unavailable)

This provides instant reads via IPFS, permanence guarantees via Arweave, and search via regular database.
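The parallel write path above can be sketched with Promise.allSettled, so a failure of one backend does not lose the other write. All backend functions here are hypothetical stand-ins, injected so each can be swapped or mocked:

```typescript
// Hypothetical backend interfaces for the hot/cold/index write path
interface StorageBackends {
  pinToIpfsCluster: (data: Buffer) => Promise<string>  // returns CID
  uploadToArweave: (data: Buffer) => Promise<string>   // returns TXID
  indexInDatabase: (cid: string, txId: string) => Promise<void>
}

interface WriteResult {
  cid?: string
  arweaveTxId?: string
  errors: string[]
}

async function writeEverywhere(data: Buffer, backends: StorageBackends): Promise<WriteResult> {
  // Hot and cold writes run in parallel; one failing does not block the other
  const [ipfs, arweave] = await Promise.allSettled([
    backends.pinToIpfsCluster(data),
    backends.uploadToArweave(data),
  ])

  const result: WriteResult = { errors: [] }
  if (ipfs.status === 'fulfilled') result.cid = ipfs.value
  else result.errors.push(`ipfs: ${ipfs.reason}`)
  if (arweave.status === 'fulfilled') result.arweaveTxId = arweave.value
  else result.errors.push(`arweave: ${arweave.reason}`)

  // Index only complete writes, so reads never resolve to missing data
  if (result.cid && result.arweaveTxId) {
    await backends.indexInDatabase(result.cid, result.arweaveTxId)
  }
  return result
}
```

Partial failures surface in `errors`, letting a background job retry the missing backend instead of failing the whole request.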

Integrity verification

Content addressing provides built-in verification for IPFS: CID is the hash of contents. For Arweave, verification via transaction proofs:

import { createHash } from 'node:crypto'

async function verifyArweaveData(txId: string, expectedHash: string): Promise<boolean> {
    // getData with decode: true returns the raw bytes stored in the transaction
    const data = await arweave.transactions.getData(txId, { decode: true })
    const hash = createHash('sha256').update(data as Uint8Array).digest('hex')
    return hash === expectedHash
}

Encryption and access control

Decentralized storage doesn't mean public. For sensitive data, Lit Protocol provides threshold encryption with on-chain access conditions:

import * as LitJsSdk from '@lit-protocol/lit-node-client'

// Access condition: NFT owner from collection
const accessControlConditions = [{
    contractAddress: NFT_CONTRACT,
    standardContractType: 'ERC721',
    chain: 'ethereum',
    method: 'balanceOf',
    parameters: [':userAddress'],
    returnValueTest: { comparator: '>', value: '0' }
}]

// Encrypt before uploading to IPFS
const { ciphertext, dataToEncryptHash } = await LitJsSdk.encryptString(
    { accessControlConditions, dataToEncrypt: sensitiveData },
    litNodeClient
)

// Decrypt — only if the on-chain condition is met; a real call also needs
// the caller's session signatures (sessionSigs) obtained via wallet auth
const decrypted = await LitJsSdk.decryptToString(
    { accessControlConditions, ciphertext, dataToEncryptHash, chain: 'ethereum', sessionSigs },
    litNodeClient
)
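A common pattern is to bundle the ciphertext, hash, and access conditions into one JSON object before pinning it to IPFS, so a single CID carries everything needed for a later decryptToString call. The payload format below is a hypothetical sketch, not a Lit-defined structure:

```typescript
// Hypothetical payload format for storing Lit-encrypted data on IPFS
interface EncryptedPayload {
  ciphertext: string
  dataToEncryptHash: string
  accessControlConditions: unknown[]
}

function serializePayload(p: EncryptedPayload): Buffer {
  return Buffer.from(JSON.stringify(p), 'utf8')
}

function parsePayload(raw: Buffer): EncryptedPayload {
  const p = JSON.parse(raw.toString('utf8'))
  // Reject objects missing the fields decryption will need
  if (!p.ciphertext || !p.dataToEncryptHash) {
    throw new Error('not a valid encrypted payload')
  }
  return p
}
```

Since only ciphertext leaves the client, pinning this payload publicly is safe: the gating happens at decryption time, against the on-chain condition.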

Cost model and stack selection

| Criterion | IPFS + Cluster | Filecoin Deals | Arweave / Irys | Storj |
|---|---|---|---|---|
| Permanence | While pinned | 1–5 years per deal | Permanent | While paid |
| Read speed | High | Slow (retrieval) | Medium | High |
| Cost | Infrastructure | ~$0.0001/GB/month | ~$10/GB one-time | ~$4/TB/month |
| Censorship resistance | Medium | High | Very high | Medium |
| Search | CID only | No | GraphQL by tags | No |

For NFT projects and DAOs: Arweave via Irys is the only sensible choice — metadata costs are negligible, permanence is critical. For application data with high latency requirements: IPFS Cluster with hot replication + Filecoin for archive copies. For enterprise with compliance requirements: hybrid scheme with Lit Protocol encryption.