"We are not what we know but what we are ready to learn" || Industrial Engineer turned data analyst turning Blockchain Developer
https://r1oga.now.sh/@r1oga

Talent Stack

Scott Adams Talent Stack

Most of the time, excellence and greatness are understood as specific: "the fastest...", "the best x-player", "the best in x discipline...", "the expert in field x...".
So the obvious way to become valuable is specialization. This requires a lot of discipline, time, patience and personal drive.
Scott Adams' Talent Stack offers a different path. It says that even if your skill level is mediocre, if the mix of skills is right, you can become unique and valuable too.
This path is easier to take. The Talent Stack concept also helps explain success in cases where observers might describe it as "surprising".

Examples

Kanye West

I am quite into Hip Hop / RnB music. Although I love his music, I can come up with plenty of other male artists who surpass him in specific skills.
But his unique mix of skills led to his success.

| Skill | Kanye's level | Not as good as |
| --- | --- | --- |
| Rapping | ok | Eminem, Twista, Yelawolf, Tech N9ne |
| Vocoder | ok | T-Pain, Zapp & Roger |
| Singing | can't | Timberlake, John Legend, Usher, Chris Brown |
| Writing/Lyrics | ok | Andre 3000, 2Pac |
| Composing/Beat making | ok | Madlib, Metro Boomin, Mike Will Made It |
| Dancing | can't | Usher, Timberlake, Chris Brown |
| Business acumen | ok | Buffett |
| Social media presence | good | Obama, Rihanna, Bieber |
| Self promotion | ok | ? |
| Fashion | ok | Beckham? |

So he is the best in none of his skills. Most of the time he's good or just good enough. It is by combining all his skills together that he succeeded in becoming so unique.
The same goes for his wife Kim Kardashian West. I've always been somewhat astonished by her rising success.

K. Kardashian

| Skill | Comment |
| --- | --- |
| Branding & marketing | Obviously good, she is her own brand |
| Media | Knows how to deal with press, reporters, interviews... |
| Aesthetics | Not the most beautiful, but pretty enough |
| Network | Leveraged the connections she built during/after the Ray J sex tape episode |
| Engagement on social media | She is maybe not THE best at interacting with and engaging fans online, but she can do it and does it efficiently |
| Family | Spread success to her family |
| Risk management | The sex tape episode was definitely a bold and carefully calculated move... |
| Confidence | |
| Recruiting | She partners with the right people in any industry: fashion, music, media... |

Each single skill alone is not enough. But the combination...
This is how a woman went from a sex tape to a $377M net worth: the right talent stack.

A further application and illustration of this concept is what Naval Ravikant describes as an "unstoppable" skill set:

The unstoppable skill set: Build & Sell

Naval notes that every successful company, individual or team needs to be good at 2 categories of skills: building and selling.

  • Building: R&D, design, engineering, coding, manufacturing, delivering...
  • Selling: Sales, marketing, communicating, recruiting, PR...

Let's imagine an excellent engineer building amazing products. If he can't market himself or attract customers, nobody will ever learn how great his products are. No customers. No sales. Since engineering requires a lot of focus, time and resources, it is not sustainable.
The same goes the other way around. What if you're a great marketer? If the product you're selling isn't good, customers won't buy again. You won't earn the resources necessary to improve your product. Not sustainable either.
You can't be the best at both, but you need a decent skill level in both.

As illustrations, he mentions in his post some famous successful teams that were made of a "Builder" and a "Seller" combo: Jobs and Wozniak, Gates and Allen, the usual CEO/CTO combo of many startups...
Then you have unstoppable people such as Elon Musk. People who can do both: build and sell. He is not good enough to design the whole rocket himself, but he is good enough to drive all key technical decisions. So he is a Builder. He has excellent business acumen too, which makes him a Seller as well.

So builders should endeavour to become sellers and sellers to become builders?
Reality is a bit unfair.
Bill Gates said: “I’d rather teach an engineer marketing, than a marketer engineering.”
A seller will indeed have a harder time learning to build than a builder learning to sell. Learning to sell as an engineer can still be challenging though. Depending on character and personality, builders have to figure out what they feel most comfortable doing. What communication they're best suited for:

  • Person to person: recruiting, fund raising
  • Writing: blog, articles, tweets
  • Public speaking: make presentations, conferences, workshops
  • Talking: podcasts, videos
  • Photos

CODA: zk-SNARKS & recursive composition for a constant-size Blockchain

CODA logo
I actually introduced in my previous post what zero knowledge proofs are just for the sake of introducing this application I am very excited about: CODA.
CODA uses zk-SNARKs to build a tiny, constant-size blockchain. So tiny that it could run natively in browsers!

If you're still reading this after reading the very title of this post, I think I can also assume that you are familiar with the concepts of blockchain and zero knowledge proofs.

CODA is a new cryptocurrency protocol built on a "succinct blockchain". Succinct here means small in size and easy to verify.
It leverages zk-SNARKs to compress the blockchain down to a few kilobytes, 22 KB to be exact.
To put this in perspective, let's compare it to the 2 first blockchains in terms of adoption and market capitalization.
At the time of writing and according to bitinfocharts, the Bitcoin and Ethereum blockchains are respectively 14 million and 10.6 million times bigger than CODA.

| Blockchain | Size | /CODA |
| --- | --- | --- |
| Bitcoin | 308.38 GB - ever growing | 1.4e7 |
| Ethereum | 233.00 GB - ever growing | 1.06e7 |
| CODA | 22 KB - constant | 1 |
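These ratios are easy to sanity-check with a few lines of Python (sizes as listed above, taking 1 GB = 10^9 bytes and 1 KB = 10^3 bytes):

```python
# Blockchain sizes at the time of writing, in bytes
sizes_bytes = {
    "Bitcoin": 308.38e9,   # 308.38 GB, ever growing
    "Ethereum": 233.00e9,  # 233.00 GB, ever growing
    "CODA": 22e3,          # 22 KB, constant
}

for name, size in sizes_bytes.items():
    ratio = size / sizes_bytes["CODA"]
    print(f"{name}: {ratio:.2e} times the size of CODA")
```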

One could argue that 200-300 GB is still a reasonable and affordable volume of data.
One can buy a 500 GB internal SATA SSD for 58€ on Amazon. So a tech-savvy user likely has the resources to set up a node on a personal computer and join the Bitcoin or Ethereum blockchain as a validating node.
However, nowadays most computers are... mobile phones. There's no way they can deal with such big amounts of data.

Which is why CODA is so exciting. It reduces the technical requirements to run a node so much that mobile phones could become nodes and verify the whole blockchain. It would solve scalability, security and decentralization challenges.

Reminder

A zk proof proves that a statement is true without revealing any info beyond the validity of the statement itself.

When dealing with a blockchain, what could we possibly want to check?
...That the blockchain is valid! That the blocks are valid and correctly chained together.
What's the big deal?
Normally this validation is the job of validating nodes. Nodes that check whether the blockchain is still valid after submission of a new block. A member/user of the network can decide to trust people to validate for him (delegation) or can decide to perform the validation himself. In this case the user has to carry the costs associated with the validation process: remember the ever growing size of blockchains?

As long as you can be convinced that the blockchain is valid, you don't actually want to bother checking the full blockchain history backwards. You only care about a proof that the blockchain is valid. You don't care about the whole content... sounds like a good application for zk proofs, doesn't it?
This is precisely what CODA does: using zk-SNARKs to certify that the blockchain has been validated correctly. Like a proof that an audit was executed properly.

Use of zk SNARKs in CODA

  • Updating the blockchain is just one computation
  • SNARKs can verify any computation
  • Processors produce SNARKs that certify they are updating the blockchain correctly
  • End users don't check the blockchain themselves, they check the tiny certificates instead

How could we practically use zk-SNARKs as a verification mechanism in a blockchain architecture?

Naive architecture: 1 SNARK for each block?

One could use them to produce a certificate which says: "I know a block which, when applied to a database in state 1, results in a database in state 2. We can get from 1 to 2 with this block".
The end user receives:

  • certificate
  • a Merkle path into the database (DB) in state 2 to check their balance without having to see the complete difference between DB1 and DB2.
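To get a feel for what checking such a Merkle path involves, here is a minimal Python sketch (SHA-256 tree; the `verify_merkle_path` helper and the leaf format are my own illustration, not CODA's actual code):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_merkle_path(leaf: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling hashes.

    `path` lists (sibling_hash, sibling_is_left) pairs from leaf to root.
    """
    node = h(leaf)
    for sibling, is_left in path:
        node = h(sibling + node) if is_left else h(node + sibling)
    return node == root

# Toy database with 2 accounts: root = H(H(a) + H(b))
a, b = b"balance:alice=10", b"balance:bob=5"
root = h(h(a) + h(b))
assert verify_merkle_path(a, [(h(b), False)], root)  # Alice checks her balance
```

The user only needs the root (committed to by the certificate) and a logarithmic number of sibling hashes, never the full database.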

What if we're not sure that DB1 is valid? We chain.
Just like the Bitcoin blockchain chains blocks together, we would chain certificates backwards to check the validity of the whole chain: not a chain of blocks but a chain of certificates.

This would be an improvement in size, but the blockchain would still grow linearly.

Better architecture: 1 SNARK for the whole blockchain!

In the previous architecture, 1 SNARK was a certificate attesting the validity of one block:
1 SNARK = 1 check = 1 proof that a block was computed correctly
The idea of composition is to check the checking process.

Recursive composition

Checking the validity of a SNARK is itself a computation and so itself can be certified with a SNARK.
So we check if all the past SNARKs have been computed correctly, which produces... 1 SNARK.

# snarkify(0, 1) = SNARK 1 that proves we can get from 0 to 1
# snarkify(1, 2) = SNARK 2 that proves we can get from 1 to 2
# snarkify(snarkify(0, 1), snarkify(1, 2)) = SNARK 3 that proves we can get from 0 to 2

# and so on ...
0>1, 1>2 ----> 0>2, 2>3---> 0->3
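Real recursive SNARKs require heavy cryptographic machinery, but the bookkeeping of the composition can be sketched in plain Python. Here a "proof" is just a record of the transition it certifies (no cryptography at all, a mental model only): merging two adjacent proofs always yields a single proof of the combined transition, so the chain of certificates never grows.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proof:
    """Toy stand-in for a SNARK: records which state transition it certifies."""
    start: int
    end: int

def snarkify_block(start: int, end: int) -> Proof:
    """Prove one block: applying it takes the state from `start` to `end`."""
    return Proof(start, end)

def merge(p1: Proof, p2: Proof) -> Proof:
    """Prove that both proofs check out; the result is still ONE proof."""
    assert p1.end == p2.start, "proofs must chain"
    return Proof(p1.start, p2.end)

# 0>1, 1>2 ----> 0>2, then 0>2, 2>3 ----> 0>3, and so on
proof = snarkify_block(0, 1)
for n in range(1, 1000):
    proof = merge(proof, snarkify_block(n, n + 1))

print(proof)  # Proof(start=0, end=1000): one constant-size object for 1000 blocks
```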

End user

Downloading this final SNARK is sufficient proof of the blockchain's validity: downloading a ~1 KB SNARK ~ validating the whole chain.
In addition, to know his account balance the user has to download the Merkle tree (22 KB).

Conclusion

Let that sink in:
a wallet implementing the CODA protocol can get fully synchronised and achieve full security after the few milliseconds required to download the single SNARK.
It takes days/weeks to fully sync a node with the Ethereum or Bitcoin blockchains. The node also has to be kept up and running to avoid being left behind and becoming out of sync.
Everyone can become a validating node without any extra costs.

  • [x] Scalable
  • [x] Decentralized
  • [x] Secure

Perfect solution?
Not quite yet. One limitation of zk-SNARKs is that they rely on a "trusted setup". I haven't found out how CODA deals with this yet.
zk-STARKs are alternative proofs to zk-SNARKs that don't require such a trusted setup. Could we compose zk-STARKs recursively instead?

The video featuring CODA CTO Izaak Meckler from which I learned what I've just shared is available here.

Blockchain privacy technologies series: Introduction to zero knowledge proofs

In most situations where you (Prover) make a statement and have to prove its authenticity to someone (Verifier), you actually have to disclose the actual value of the statement's argument.
Examples

| Situation | Statement | Argument | What is disclosed as a practical proof |
| --- | --- | --- | --- |
| Voting, driving, buying alcohol... | I am over 18 yo | Age | ID card: actual age, birthdate, name |
| Executing a bank transfer or card transaction | I have enough funds on my account | Account balance | Actual account value |
| Mathematical problem | I know the solution | Mathematical solution | How to solve the problem |
| Private key | I am the owner of the private key | Private key | Actual private key value |

Having to disclose the value of my account to anyone I want to buy something from, as proof that I can actually pay, is very problematic with regards to privacy and security.

Zero Knowledge Proofs

Zero knowledge proofs are precisely a mechanism to assert knowledge without divulging it.
They rely on probabilistic verification (e.g. verifying the equality of 2 polynomial products by randomly selecting several checkpoints). The verifier challenges the prover with some randomness. If the correct answer is given each time, the prover has a high probability of actually possessing the claimed "knowledge".

Example: How to explain zk proofs to your children - The strange Ali Baba cave

As illustrated by the Binance Academy, imagine a ring-shaped cave with a single entry and a magic doorway that separates the two side paths. In order to go through the magic doorway, one needs to whisper the correct secret words. Now consider that Alice (yellow) wants to prove to Bob (blue) that she knows what the secret words are - while still keeping them secret.

  1. Bob waits outside. Alice enters the cave and walks until the end of one of the two possible paths. Alibaba Cave
  2. Bob walks to the entrance and shouts which side he wants Alice to appear from. Open magic doorway
  3. If Alice truly knows the secret, she will reliably show up from the path Bob names. Alice shows up

Bob could think that Alice doesn't really know the secret words and was just lucky (a 50% chance of choosing the right side of the cave). To convince him they can repeat the operation. After n repetitions, the probability of Alice luckily choosing the right side every time has decreased to (1/2)^n. The bigger n, the more reliable the proof.
Hence the description of zero knowledge proofs as a probabilistic verification mechanism. zk proofs aren't proofs in the mathematical sense because there is always a small probability (converging to 0) that a cheating prover may convince a verifier that a false statement is true.
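The convergence is easy to quantify. After n rounds, the probability that a prover with no knowledge got lucky every single time is (1/2)^n:

```python
def cheat_probability(n: int) -> float:
    """Chance that a cheating prover passes n rounds by pure luck."""
    return 0.5 ** n

for n in (1, 10, 20, 30):
    print(f"{n} rounds: {cheat_probability(n):.2e}")
# After 20 rounds the chance of successful cheating is already below one in a million.
```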

Another famous example is the 3-colorability puzzle.

Properties

The combination of the 3 following properties defines more formally a zero-knowledge proof:

Soundness: cheaters get caught

A dishonest prover can't convince a verifier that a false statement is true.

Completeness: true statements get accepted

Following the protocol, an honest prover will naturally convince the verifier that a true statement is true.

Zero knowledgeness: true statements don't teach the verifier anything else except it being true

Applications

zk-SNARK

zk-SNARKs are a particular type of zk Proof

Zero Knowledge

Succinct

Proofs are small in size and quick to verify.

Non interactive

The basic zero-knowledge proof protocol is interactive. It requires the verifier to constantly ask a series of questions about the "knowledge" the prover possesses. zk-SNARKs are non-interactive in the sense that little to no interaction is required between the Prover and the Verifier. The Prover can publish their proof in advance, and a verifier can check its correctness.

ARgument of Knowledge

= considered computationally sound (see soundness property).

Limitations

zk-SNARKs have limitations though. They are dependent on a trusted setup between the prover and the verifier. A set of public parameters is required to construct zero-knowledge proofs. This creates a potential centralization issue because the parameters are often formulated by a very small group. The initial setup phase is critical in preventing counterfeit spending because if someone had access to the randomness that generated the parameters, they could create false proofs that seemed valid to the verifier.

zk-STARKs were invented as an alternative zk proof mechanism to zk-SNARKs, one that doesn't require such an initial trusted setup.

zk-STARK: Succinct Transparent ARgument of Knowledge

Transparent

zk-STARK proofs present a simpler structure in terms of cryptographic assumptions. However, this novel technology comes with the disadvantage of generating bigger proofs compared to zk-SNARKs.

Apology of Permissionless Blockchains

I don't believe that private blockchains will become global. They will be confined to national governmental use or to industry-specific consortiums.
I'll try to support my position by examining the architecture choices that different users may make through the lens of the following human motivations.

Motivations

Acquire (greed) – Desire to collect physical objects as well as immaterial ones like power, status, and influence
Defend (fear) – Desire to protect ourselves and our property
Bond (belonging) – Desire to form relationships
Learn (curiosity) – Desire to satisfy our innate curiosity

Of course, other motivations exist, such as Feel (escape - desire for sensory stimulus and pleasure), but I don't think they are applicable to this reflection on the architecture choice of a blockchain solution.

Users

Businesses & Enterprises
Entrepreneurs
Random people
Governments

Pre-analysis

| User | Greed (Acquire) | Defend (Fear) | Bond (Belonging) | Learn (Curiosity) |
| --- | --- | --- | --- | --- |
| Businesses & Enterprises | xx | x | | |
| Entrepreneurs | xx | | x | xx |
| Common people | x | x | x | x |
| Governments | xx | xx | | |
| Total | 7 | 4 | 2 | 3 |
  • Business & Enterprises: profit driven institutions (greed).
  • Entrepreneurs: also want to make money (greed). Ready to take risks. Want to innovate (learn).
  • People: as it may naturally vary between individuals, I scored all attributes equally.
  • Governments: especially interested in maintaining their power.

Permissioned or permissionless?

1. Greed: where is money to be made?

Public blockchains are to transacting value what the Internet is to communicating information. Permissioned/private blockchains are to blockchains what intranets are to the Internet.
Do companies from the same industry exchange information and communicate over an industry-wide intranet?
They don't. They use the open world wide web. It is where all the services were, are and will be built.
Because open networks stimulate innovation, while restricted and closed ones hamper it. The real value of blockchains is in coordination and permissionless innovation. So I believe more entrepreneurship, and therefore more value creation, will happen on public blockchains.
Problem: they are slow (dedicated post about scalability solutions to come).

Permissionless > Permissioned

2. Fear: what provides users with the best protection against what they value?

What do users value? Who/what may they seek protection against?

| User | Values | Protection against / Fear |
| --- | --- | --- |
| Businesses, Enterprises, Entrepreneurs | Assets, capital, cashflow, copyrights, licenses, specific knowledge, secrets (technologies, deals, ideas...) | Competition, thieves, spies, data destruction, data loss, data modification, disclosure... |
| Governments | Control, surveillance, coercion power; military secrets; justice; democracy; cultural heritage; art; land registry; monetary and media control | Foreign states, fraud, injustice, spies, data destruction, data loss, data modification |
| Common people | Personal assets, privacy, legacy, freedom, identity | Thieves, censorship, arbitrary decisions, intrusive surveillance, spies, data destruction, data loss, data modification |

Unsurprisingly, 3 famous blockchain attributes emerge as mechanisms to offer protection against the different threats just listed:

Privacy: people protecting themselves from surveillance, businesses protecting competitive secrets.

Some private blockchain solution providers such as Corda claim that "permissionless blockchain platforms—in which all data is shared with all parties—are largely unsuited for businesses".
This statement used to be true. But today privacy solutions on public blockchains exist (dedicated post to come).

Governance

By governance I mean the ability to define or redefine the rules of the network.
Permissioned blockchains are the most straightforward architecture to fulfill this need, which is paramount for governments.
Indeed, governance seems incompatible with the decentralization and immutability of public blockchains.
However, public blockchains have been (more or less successfully) experimenting with on-chain governance and DAOs (dedicated post to come).

Security

(Dedicated post to come)
Permissionless = Permissioned # Diligence required!

3. Curiosity & Belonging: what's more open and inclusive?

Public blockchains are by design permissionless: open and inclusive. Permissioned ones are the opposite.
Anyone can join a public blockchain. Which is obviously the preferred choice for curious people who want to freely exchange & transact or try & learn new things. Moreover, innovation and entrepreneurship (see 1) will thrive on an open network, whereas restrictions on a closed network will hamper it.
Permissionless > Permissioned

Conclusion

I only see two motivations that would drive the selection of a permissioned blockchain over a public one:

  • The fear of losing a central control power that individuals, organizations or governments may possess.
  • The fear of data destruction/loss/modification due to security risks & features specific to public blockchain architectures (such as miner collusion if PoW is used, or immutability, i.e. the incapacity to rewrite history...)

Bitcoin and Stock to Flow Model

Common sense tells us that scarce things are valuable or costly. Scarce things such as precious metals or antiques are especially costly because they are hard to create for anyone.

Nick Szabo calls this property of being costly to forge "unforgeable costliness". Unforgeable costliness provides value independently of 3rd parties. Monetary systems are based on objects that are naturally (precious metals) or artificially (fiat and accounting) unforgeably costly.

However, we can't really pay with metal online. Worse, it is free to create bits online. To have digital money we need a technique to create bits online in a costly way.
This technique would ensure that the "bits" it produces keep being scarce. These bits would then become a suitable digital money.
No such technique had ever existed until Bitcoin: Bitcoin is a protocol - a "technique" - that produces, at a high cost (the electricity bill), bitcoins that can be used as digital money.

The relationship between unforgeable costliness and value can be demonstrated by the stock to flow model.
StockToFlow = SF = stock / flow = 1 / supply growth rate
Stock is the existing reserves at a given time. Flow is the yearly production, the yearly injection of new volumes of commodities.

SF values of different commodities

(figure: SF numbers for gold, silver and other commodities)

It is very hard for a commodity to increase its SF. As soon as individuals stockpile it, the supply and demand equilibrium breaks and prices rise. This incentivizes people to produce more of it (e.g. mine more palladium), and prices fall again.

So this property - "unforgeable costliness" - is essential.

Bitcoin SF model

Bitcoin has a current stock of 18.1M coins and a flow of 0.7M/year.
SF = 18.1 / 0.7 ≈ 26
This places Bitcoin between silver and gold.

However the Bitcoin protocol is designed in such a way that:

  • its maximal supply is fixed at 21M
  • the flow of additional bitcoins is cut in half every ~4 years (every 210,000 blocks) Bitcoin Monetary Inflation
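In Python, with the figures used in this post (stock and flow in millions of coins; the post-halving stock is a rough estimate of mine, not a protocol constant):

```python
def stock_to_flow(stock: float, flow: float) -> float:
    """SF = stock / flow = 1 / supply growth rate."""
    return stock / flow

stock, flow = 18.1, 0.7  # millions of coins
print(round(stock_to_flow(stock, flow), 1))  # ~25.9: between silver and gold

# The halving cuts the flow in half. Assuming ~0.35M more coins are
# issued before it happens (my estimate), SF roughly doubles:
print(round(stock_to_flow(stock + 0.35, flow / 2), 1))
```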

Is this model valid?

PlanB tested the hypothesis that scarcity, measured as SF, drives value.
For gold, silver and Bitcoin he:

  • collected historical supply data. In Bitcoin's case, he queried the Bitcoin blockchain to know the number of new blocks (thus new bitcoins) per month.
  • collected historical price data

Then he plotted SF vs market value (logarithmic axis).
SF vs log(market value)
There is a strong statistical relationship (R^2 ~ 95%) between SF and market value.

Can we use this model to predict Bitcoin future price?

Unlike gold, the evolution of Bitcoin's supply is known, because it is predefined by the protocol.
We then have a way to predict the evolution of Bitcoin's price.
Bitcoin price evolution
The model predicts a bitcoin market value of $1trn after the next halving in May 2020, which translates into a bitcoin price of $55,000.

On-demand delivery services and meal kits for a more sustainable food industry?

Upon reading the "Fate of Food" article in the Imagine 2030 report from Deutsche Bank Research, I started reconsidering my opinions on the food industry's current problems and potential solutions.
I tend to feel bad when I indulge myself by ordering a meal, and I have always refused to use a meal-kit service such as HelloFresh.
Indeed, besides food quality and cost considerations, isn't it nonsense from an ecology, energy or sustainability point of view?
More packaging, more transport costs, more energy consumption?
But what if I was wrong?

From supply-driven to demand-driven

Today the food chain is supply driven: the upstream producers push whatever they harvest down to the downstream customers.

  1. Farmers harvest crops
  2. Wholesalers buy from farmers
  3. Retailers buy from wholesalers
  4. You and me or restaurants buy from retailers

The problem is that wholesalers especially, but also retailers and end customers, buy in the hope that the food will be eaten, ordered, or cooked in the coming week.
Otherwise the food is wasted.
All the apples or tomatoes you see on the shelves of your supermarket will actually become waste unless someone buys them or the supermarket donates its food surplus.

In a demand driven chain, the downstream customers pull only what they need from the upstream producers.
It could obviously reduce waste because only what is necessary would be bought or ordered. It requires much better planning though.
How can wholesalers or retailers reliably forecast the volume of food they need per month throughout the year, taking into consideration seasons, customer trends...?
This is where meal-kit services come into play. Because they collect tonnes of data on the ordering habits of their customers, they can build these necessary demand forecast models. Machine learning and artificial intelligence can optimize them even further.

  1. Customers order meal kits periodically
  2. Meal-kit services can forecast the food supply they need and order accordingly accurate amounts from the retailers
  3. Retailers can forecast the food supply they need and order accordingly accurate amounts from the wholesalers
  4. Wholesalers can better distribute what they bought from farmers between retailers
  5. Farmers keep harvesting as much

HelloFresh claims its model cuts food wastage by up to 4/5 compared with a traditional retailer.

Incentivize customers to order meal kits?

The previous reasoning about a demand-driven food chain fails if customers don't order meal kits periodically. The more frequent the meal-kit orders, the more reliable the forecast models.
This is actually a strong assumption because it involves an important change in customer behaviour. Am I ready to commit today to my meals for the complete next week? I am forced to plan ahead. No more spontaneity.
How to incentivize customers: raising awareness of the positive impact it could have on food waste, price...?

Blockchain: what is it, what is it for?

For a technical definition, read here.
A non technical definition is: "a technology that enables digital exchanges of value between entities that don't trust each other without intermediary".
Why does it matter?

Internet: information network

If you are reading this post, I can safely bet you know what the Internet is. So you should be able to wrap your head around the idea of what a network is: the Internet is often referred to as THE NETWORK.
One can be 'part of', 'on' or 'connected to' a network. Anyone on the Internet can directly exchange digital information with any other connected member.
Services whose job used to be precisely to execute some form of exchange on behalf of a sender and a recipient progressively became obsolete.

| | Pre Internet | Post Internet | Information exchanged |
| --- | --- | --- | --- |
| Communication | Post office | Emails | Peer-to-peer messages |
| News | Newspapers | Blogs, Twitter | News |
| Music & video | Music labels and movie studios | Streaming | Released music |
| Hospitality | Hotels | Airbnb | Available rooms |
| Transport | Taxis | Uber | Available taxis |

In practice, that exchange is seldom disintermediated. Indeed it is more convenient to rely on streaming & hosting services or apps. However nothing technically requires doing so.
Internet = birth of a disintermediated information network.

Post office, newspaper, music and movies studios, hotels, taxis... anything missing on the list?
BANKS. Banks have kept being required because they don't only facilitate exchange of information but exchange of value.
I have often read or heard that "data is the new oil". As a data analyst I used to believe it myself. I would repeat it to people to make myself look important. However, since such data is exchanged digitally over the Internet, this statement is total bullshit.
Because: ctrl + c & ctrl + v.
Oil is physical. Have you ever copy-pasted an oil barrel? That is why oil has value. Selling a barrel means the seller loses ownership of that barrel to the benefit of the buyer who pays him. Value flows both ways (asset and money).
Data isn't physical. I can just hand over a copy created at no cost and sell the file again later. I don't lose ownership of that file. The file's value hasn't been transferred from me to the buyer. Value flows one way (money only).

Oil:  Buyer --- money ---> Seller
      Buyer <--- oil ----- Seller

File: Buyer --- money ---> Seller
      Buyer <--- copy ---- Seller (who keeps the original)

Another example: sending 1 mail to 1 person or sending 1 mail to 1,000 persons costs the same.
So, as such, data has zero fundamental & commercial value.

Double spending problem

This "copy pasting" of an immaterial asset leads to the so called double spending problem.
How can one prevent a double spend?

Ledger

One must keep track of the past exchanges performed. Double spends could then be detected by checking this history of exchanges. This "history of exchanges" is called a ledger.
Let's say I want to transact with a stranger I don't trust.
Who will write into the ledger?

  1. Myself: I can "omit" transactions so that my double spends don't appear and screw the stranger. He/she won't let that happen.
  2. The stranger: he/she can "omit" transactions so that his/her double spends don't appear and screw me. I won't let that happen.
  3. We both write in the same ledger: the ledger will turn into an inconsistent and useless mess.
  4. We both write in our own ledger: two ledgers exist. Their contents will conflict. Which one to believe?
  5. We ask a third person to write all the transactions in the ledger for us: we both hope for the best and decide to trust that this person won't try to screw us.

1, 2, and 3 obviously can't work.

Centralized ledger

5 is the centralized way of solving this problem. We have been relying on it because banks provide a solution to this problem. They act as a trusted intermediary responsible for checking that an exchanged immaterial asset is only spent once. The very reason for banks' existence is to make trusted exchanges possible between people who don't trust each other.
When Alice wants to wire money to Bob, she doesn't wire it herself from her account to Bob's. Here's what happens:

  1. Alice wants to pay Bob.
  2. The bank checks her account balance
  3. The bank subtracts the amount from her account balance
  4. The bank adds that amount to Bob's account
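The four steps above fit in a few lines of code. A minimal sketch (the `Bank` class and its method names are my own illustration):

```python
class Bank:
    """Centralized ledger: one trusted party keeps all balances."""

    def __init__(self):
        self.ledger = {}  # account name -> balance

    def open_account(self, name: str, balance: int = 0):
        self.ledger[name] = balance

    def transfer(self, sender: str, recipient: str, amount: int):
        if self.ledger[sender] < amount:   # 2. check the balance
            raise ValueError("insufficient funds")
        self.ledger[sender] -= amount      # 3. subtract from the sender
        self.ledger[recipient] += amount   # 4. add to the recipient

bank = Bank()
bank.open_account("Alice", 100)
bank.open_account("Bob")
bank.transfer("Alice", "Bob", 30)
print(bank.ledger)  # {'Alice': 70, 'Bob': 30}
```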

The bank is in charge of keeping a big accounting book up to date.
The main issue with that solution is that you have to "hope for the best and decide to trust that banks won't try to screw you".

Decentralized ledger

This is the approach described in 4: instead of trusting one single person to maintain the ledger, everybody keeps and writes in their own copy.
Several challenges need to be solved:

1. How do we agree on what is the current valid version of the ledger?

By distributing to everybody a copy of the last version of the ledger which was considered valid.

2. How do we ensure that what was written can't be changed or deleted?

  1. Bundle new records to be added to the ledger in pages
  2. "Mark uniquely" each page based on its content.
    This makes changing a page's content obvious. For instance, imagine writing in the corner of a page the number of characters on the page. Adding or deleting a character obviously changes that number. So cheaters are detected. The harder the creation of that mark, the more secure the ledger.
  3. Link each new page to the previous one. Write in each page's corner its "unique mark" along with the mark of the previous page. This way, if one wants to "re-mark" a page, one has to re-mark all the following ones to achieve the change without being noticed. This makes the ledger even more secure.
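A cryptographic hash function provides exactly this kind of "unique mark". A toy sketch of chained pages in Python (using SHA-256 as the marking function):

```python
import hashlib

def mark(content: str, previous_mark: str) -> str:
    """The page's 'unique mark', derived from its content AND the previous mark."""
    return hashlib.sha256((previous_mark + content).encode()).hexdigest()

pages = ["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Alice 1"]
marks = []
prev = "0" * 64  # genesis: there is no previous page
for content in pages:
    prev = mark(content, prev)
    marks.append(prev)

# Tampering with page 0 changes its mark, which invalidates every later mark:
assert mark("Alice pays Bob 50", "0" * 64) != marks[0]
```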

3. How do we agree on what to write next?

By intuition, a fair way would be to decide "democratically": the correct version is the one most people consider valid.
But how do we count what "most" is?
We need a digital voting system. In particular, we need to be able to ensure nobody votes twice; otherwise our voting system is rigged. In the real world we prevent people from voting twice by checking who they are before letting them vote.

On internet nobody knows you are a dog

Don't forget we are "on the Internet", which means we can't identify people. Websites may require you to authenticate, but "Internet protocols do not force users to identify themselves". There are ways to identify people on the Internet, but they all end up relying on some form of ID/certificate provider granting users digital IDs. These ID providers would become the "trusted" 3rd party, in charge of maintaining the ledger of all IDs. We would be back to the centralized scenario I refuse to follow.

In a digital voting system, the same person could impersonate lots of different "digital profiles". Using email accounts as digital IDs to count votes? One physical person could create 1000 different mail accounts at no cost and vote 1000 times.

Voting costs

As we can't identify people, we need a way to dissuade them from voting twice: voting must become costly. It needs to take time or cost money.
But if voting is expensive, why would people vote in the first place?
One solution is to add to the voting system a lottery system:

  • make it expensive to be granted the right to vote
  • reward one of the voters at random: the winner gets rewarded economically and gets the right to add a new entry in the ledger

Application: Bitcoin

This brilliant combination of voting and lottery system is actually at the core of the Bitcoin Blockchain.

Voter  Miner
Voting costs  Mining: solving a mathematical challenge ("Proof of Work") that can only be solved by brute force, by spending computing resources (this produces the "unique mark" on each page of the ledger)
Ledger  Bitcoin Blockchain
Content of the ledger  Transactions = exchanges of Bitcoin
Consensus  The longest ledger is the correct one (the one most people spent resources to build!)

Blocks are simply groups of transactions. Bundling transactions in blocks makes it easier to check the validity of the ledger, just as it is easier to review a long text when it is structured into the pages of a book.
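Such a brute-force challenge can be sketched in Python (illustrative difficulty and hashing scheme; Bitcoin actually uses double SHA-256 over block headers and a far higher difficulty):

```python
import hashlib

def mine(block_content, difficulty=4):
    """Brute-force a nonce until the block's hash starts with `difficulty` zeros."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_content}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1

nonce, digest = mine("Alice pays Bob 5")
# Finding the nonce takes ~16^4 hash attempts on average; verifying it takes one
assert digest.startswith("0000")
```

This asymmetry is the point: "voting" (mining) is expensive, but checking a vote is nearly free, so everybody can verify the winner honestly paid the cost.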

Blockchain: disintermediated digital exchange of value

Let's look back at the initial definition: "a technology that enables digital exchanges of value between entities that don't trust each other without intermediary".
Whenever we want to exchange value (transfer money over the internet, sell or buy digital assets such as pictures, music, certificates, loyalty points...), we need to trust that double spending can't happen.
This required trust can be ensured in:

  • a centralized way, by relying on a third party (banks, escrow, notary). It comes with costs and risks: will the third party behave in your best interest?
  • a decentralized way, thanks to Blockchain technology. It also comes with drawbacks, such as transaction speed.

Blockchain Technical Definitions

Asset

Anything that has value to a stakeholder.

Block

Data structure comprising a block header and block data.

Blockchain

Specific type of DLT: a distributed ledger in which confirmed blocks are organized in an append-only, sequential chain using cryptographic links.
Blockchains are designed to be tamper resistant and to create final, definitive and immutable ledger records.

Block data

Data structure comprising zero or more transaction records or references to transaction records.

Block header

Data structure that includes a cryptographic link to the previous block.

Confirmed

Accepted by consensus for inclusion in a distributed ledger.

Consensus

Agreement among nodes that:

  1. a transaction is validated
  2. the distributed ledger contains a consistent set and ordering of validated transactions

Consensus does not necessarily mean that all nodes agree.
The details regarding consensus differ between blockchain designs and this is one key distinguishing characteristic between one design and another.

Consensus mechanism

Rules and procedures by which consensus is reached.

Cryptographic hash function

Function mapping binary strings of arbitrary length to binary strings of fixed length, such that it is computationally costly to find, for a given output, an input which maps to that output, and computationally infeasible to find, for a given input, a second input that maps to the same output.
Computational feasibility depends on the specific security requirements and environment.
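A quick illustration with SHA-256, one widely used cryptographic hash function:

```python
import hashlib

h1 = hashlib.sha256(b"hello").hexdigest()
h2 = hashlib.sha256(b"hello!").hexdigest()

# Fixed-length output (256 bits = 64 hex characters) regardless of input size
assert len(h1) == len(hashlib.sha256(b"a" * 1_000_000).hexdigest()) == 64

# A tiny change in the input yields a completely different digest,
# and finding an input for a given digest requires brute force
assert h1 != h2
```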

Cryptographic link

Reference, constructed using a cryptographic hash function technique, that points to data.
A cryptographic link is used in the block header to reference the previous block in order to create the append-only, sequential chain that forms a blockchain.

Distributed Ledger (also called distributed ledger technology: DLT)

Ledger that is shared across a set of nodes and synchronized between the nodes using a consensus mechanism.

Immutability

Property wherein ledger records cannot be modified or removed once added ("append-only") to a distributed ledger.
Where appropriate, immutability also presumes keeping intact the order of ledger records and the links between the ledger records.

Node

Device or process that participates in a network and stores a complete or partial replica of the ledger records.

Ledger

Information store that keeps records of transactions that are intended to be final, definitive and immutable

Ledger record

Record comprising hashes of transaction records or references to transaction records recorded on a blockchain or distributed ledger system.

Public Key

Key of an entity's asymmetric key pair which can be made public.

Private key

Key of an entity's asymmetric key pair that is kept secret and which should only be used by that entity.

Record

Information created, received and maintained as evidence and as an asset by an organization or person, in pursuit of legal obligations or in the transaction of business.
Applies to information in any medium, form or format.

Transaction

Smallest unit of a work process, which is one or more sequences of actions required to produce an outcome that complies with governing rules.
Where appropriate, transaction is understood more narrowly, as the smallest unit of a work process related to interactions with blockchain or distributed ledgers.

Transaction record

Record documenting a transaction of any type.
Transaction records can be included in, or referred to, in a ledger record.
Transaction records can include the result of a transaction.

Validated

Status of an item when its required integrity conditions have been checked.
A transaction, ledger record or a block can be validated.

Wallet

Application used to generate, manage, store or use private and public keys.

Night sky effect

This TED talk from Aaswath Raman really struck me and made me optimistic and hopeful about our planet's future.

Issue

The alarming starting point is the vicious cooling circle we are currently stuck in:

  1. We cool:
    • To live and sleep comfortably in places where the heat can become unbearable.
    • To keep our food longer
    • To operate data centers
  2. The warmer it gets, the more we need to cool.
  3. Back to --> 1: vicious feedback loop.

Cooling accounts for 17% of global electricity use and 8% of greenhouse gas (GHG) emissions. Demand may increase up to 6-fold by 2050. Cooling systems may become the biggest GHG contributors and electricity consumers.

Solution recipe: Night Sky Effect = target the atmosphere's transmission window to benefit from the coldness of space and build a negative heat balance thanks to thermal radiation.

The Night Sky Effect is a natural phenomenon. It is how ice is made at night in the desert even though the air is warmer than 0 °C.
How does it work?

1. Fourier's Law: heat follows negative temperature gradient

q = -k∇T

This law says that the local heat flux density q is equal to the product of the thermal conductivity k and the negative local temperature gradient ∇T.
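A minimal one-dimensional sketch with made-up values:

```python
# Fourier's law in one dimension: q = -k * dT/dx
k = 0.5        # thermal conductivity in W/(m*K) (made-up value)
dT_dx = -10.0  # temperature gradient in K/m: temperature drops along x

q = -k * dT_dx
print(q)  # 5.0 -> a positive flux of 5 W/m^2: heat flows towards the colder side
```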

In layman's terms, heat "flows" towards colder places. This tells us a first thing: because space is cooler than the Earth, heat "flows" from the Earth to space.
But what does "flowing" actually mean?

2. Thermal radiation: heat "flows" as light

When matter particles get warmer, they start moving very fast at the microscopic level (thermal motion). By doing so they also generate electromagnetic radiation, i.e. light.
This phenomenon can be visualized with night vision/thermal goggles. Another evidence is a material changing color as it gets hot, like a piece of coal glowing orange/red.
So heat flows as light from hot places (Earth) to cooler places (space).

3. Absorption & Greenhouse Effect

Unfortunately, in the same way particles can emit energy as electromagnetic radiation, they can also absorb light and generate heat back!
In particular, some gases in the atmosphere absorb part of the heat that the Earth radiates towards space: we call these gases greenhouse gases (GHG) and this effect the greenhouse effect.

4. Infrared Window

Luckily, not all of that heat is absorbed and radiated back! Otherwise the Earth would be much warmer than it is.
Light is made up of a range of different wavelengths: the electromagnetic spectrum. There are infrared, ultraviolet, X-Rays...
The atmosphere doesn't react the same way to all wavelengths. In particular, the wavelengths between ~[8 μm, 30 μm] aren't absorbed: this is called the infrared transmission window.
So if an object emits its heat within that specific window, its heat is guaranteed to go completely through the atmosphere: the object gets cooler! This is how water turns into ice at night in the desert.

Why don't we already use this phenomenon to cool everything?! Because it is called the NIGHT Sky Effect.
During the day, the sun heats the objects we may want to cool so much that the overall heat balance becomes positive again. The night sky effect is not strong enough to counterbalance heating from sunlight during the day.

5. Nanophotonics

Wouldn't it be cool to benefit from the Night Sky cooling Effect during the day?!
For this we need to "target" the infrared transmission window of the atmosphere.
Thanks to nanophotonics it is actually possible to design materials that radiate their heat precisely at the wavelengths that are best let out by the atmosphere.
It is like engineering a heat mirror: something that gets cooler when it receives sunlight! Or, very counterintuitively, something that gets cooler when it steps out of the shadow!

Applications

Manufacturing techniques to build these materials already exist. This is also what Aaswath Raman explains in his TED talk.

Cooling panels

Such materials can be used to build "cooling panels" placed in sunlight. They can already increase the efficiency of cooling systems by 12%. In the future, cooling systems may require no electricity at all!

Integration with solar cells.

Solar cells get less efficient when they get hotter. So by integrating such materials into solar panels, we can improve their efficiency.

Heat engines: "generate light from cold darkness of space".

One can imagine using the temperature delta between Earth and space to generate electricity! Or generating electricity at night, when solar panels can't work.

Bitcoin mining

Kickstarterreum: Kickstarter on Ethereum

Kickstarter helps artists, musicians, filmmakers, designers, and other creators find the resources and support they need to make their ideas a reality. Potential future customers, 'backers', can contribute to a project to finance the development of a product.

Sounds awesome.

However, a fundamental problem for crowdfunding is how asymmetrical the risks faced by backers and founders are.

After investing, backers don't get a say in how their money will actually be spent.
Worse: what guarantee do they have that the founder they backed will actually deliver what they promised, and not run away with the funds collected from the backers?

Ethereum & smart contracts are a great way to work around these issues. Collecting and spending the funds gets automated in a secure and decentralized way.

The risks previously faced by backers disappear because they get to vote on how funds are spent.

The smart contract application I deployed on the Ethereum Rinkeby test network fulfills the following:

  • [x] A founder can create a new campaign. He/she sets the minimum contribution amount for future backers.
  • [x] Anyone can back a created campaign, provided they contribute at least the minimum amount set by the founder.
  • [x] The smart contract controls the funds. Neither the founder nor the backers are able to take out or spend funds collected by the campaign.
  • [x] Only the founder can create payment requests, asking the backers to agree on how to spend the campaign's funds. He specifies the payment's recipient.
  • [x] Backers can approve (1 time each) payment requests
  • [x] The founder can finalize a payment request that has been approved by a majority of backers. This automatically executes the payment (transfers amount to recipient)
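The rules above can be modelled in a few lines of Python. This is a plain-Python sketch of the contract logic for illustration only (not the deployed Solidity code; the `Campaign` class and all names are mine):

```python
class Campaign:
    """Plain-Python model of the crowdfunding rules (not the Solidity code)."""

    def __init__(self, founder, minimum_contribution):
        self.founder = founder
        self.minimum = minimum_contribution
        self.backers = set()
        self.balance = 0
        self.requests = []

    def contribute(self, backer, amount):
        # Anyone can back, provided they reach the minimum contribution
        assert amount >= self.minimum, "below minimum contribution"
        self.backers.add(backer)
        self.balance += amount

    def create_request(self, caller, recipient, amount):
        # Only the founder can propose how to spend the funds
        assert caller == self.founder, "only the founder creates requests"
        self.requests.append({"recipient": recipient, "amount": amount,
                              "approvals": set(), "finalized": False})

    def approve(self, backer, index):
        # Each backer can approve a given request at most once
        request = self.requests[index]
        assert backer in self.backers and backer not in request["approvals"]
        request["approvals"].add(backer)

    def finalize(self, caller, index):
        # Payment only executes once a majority of backers has approved
        request = self.requests[index]
        assert caller == self.founder and not request["finalized"]
        assert len(request["approvals"]) > len(self.backers) / 2, "needs majority"
        request["finalized"] = True
        self.balance -= request["amount"]  # the contract, not a person, pays out
        return request["recipient"], request["amount"]
```

A founder creates the campaign, backers contribute at least the minimum, and funds only ever leave through a finalized, majority-approved request.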

Github repository
Demo video

Bayerische Landesbank about Bitcoin and “hard” money

BayernLB is a German state-owned bank.
Their research unit recently published a report titled Megatrend Digitalisierung.

Highlights

  • "an asset with a high stock to flow ratio (like gold) is said to be hard"
  • "Historically, assets with the highest stock to flow ratio have always been used as money, as it enabled the best value transfer over time"
  • "Bitcoin was engineered to be harder than gold".

Eventually (better late than never) they noticed the strong correlation of Bitcoin's stock-to-flow ratio with its market capitalization. Their model predicts a price of $90,000 following the halving in May 2020.
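Stock-to-flow is simply the existing stock divided by the yearly new supply. A sketch with rough, illustrative mid-2019 figures for Bitcoin (the exact numbers are mine, not the report's):

```python
# Stock-to-flow = existing stock / yearly new supply
stock = 17_900_000               # BTC already in circulation (approximate)
blocks_per_year = 6 * 24 * 365   # one block every ~10 minutes
flow = 12.5 * blocks_per_year    # 12.5 BTC minted per block -> ~657,000 BTC/year

print(round(stock / flow, 1))    # ~27; the May 2020 halving roughly doubles it
```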

Yes, large-scale data can be stored on the Ethereum blockchain

WePower/Elering Nationwide Energy Experiment

Insights from the WePower website:
WePower’s mission is to enable everyone to make a change towards a sustainable energy future through their energy purchasing decisions. WePower essentially tackles the limitations of the current instruments that exist in the market, especially Power Purchase Agreements (PPAs), which are too complex and expensive for corporate energy buyers. That's why WePower has built a next-generation renewable energy procurement and trading platform based on smart contracts that uses virtual PPAs.

...so we have #blockchain for #P2P transactions and #transparency in the energy market, #SmartContracts for cheaper, faster, transactions... and on top a dedicated #token.
It sounded like nothing new or worth further reading. It's not the first time people have claimed to revolutionize some market thanks to Blockchain.
Until I read about the test WePower carried out. They don't just throw out buzzwords. They performed a real test: trying to write energy production and consumption data on a national scale to the Ethereum blockchain.

And they did it successfully.

Scope

1 year of actual production and consumption data was provided by the Estonian energy operator Elering.

Volume of data points

households 7x10⁵
timeframe 1 year
readings 1/hour/household

7x10⁵ x 365 x 24 = 6.132x10⁹

Time

WePower assumed they could store 200 data points per block.

Total volume 6.132x10⁹
Assumed volume per block 200
Seconds per block 15

6.132x10⁹ / 2x10² x 15 = 4.599x10⁸ s ~ 5,323 days ~ 14.6 years

Costs

Based on the Ethereum yellowpaper (page 20)

Store operation 2x10⁴ gas cost
Every Transaction 2.1x10⁴ gas cost
Average Gas Price 8.5 Gwei
1 ETH 1x10¹⁸ wei
1 ETH 185.68 € (coinmarketcap, 13th August 2019)

6.132x10⁹ x (2x10⁴ + 2.1x10⁴ / 200) x 8.5x10⁹ = 1.04791281x10²⁴ wei = 1,047,912.81 ETH ≈ 195 M€
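These back-of-the-envelope numbers can be re-derived in a few lines of Python:

```python
# One reading per hour per household for a year
data_points = 700_000 * 365 * 24
assert data_points == 6_132_000_000

# Time: 200 data points per block, one block every ~15 s
seconds = data_points / 200 * 15
print(seconds / 86400)          # ~5,323 days, i.e. ~14.6 years

# Cost: 20,000 gas per stored word + 21,000 gas per 200-point transaction
gas = data_points * (20_000 + 21_000 / 200)
wei = gas * 8.5e9               # average gas price of 8.5 Gwei
eth = wei / 1e18                # 1 ETH = 10^18 wei
print(eth, eth * 185.68 / 1e6)  # ~1,047,913 ETH, i.e. ~195 M€
```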

--> WePower would need ~14.6 years and 195 M€ to run their pilot/test... GAME OVER? Not quite yet.

Optimization

To cut down testing costs and time, they:

  • compressed data
    • The EVM uses 32-byte words, i.e. numbers are represented as 256-bit integers, which is more than required for a single data point. They figured they could fit 15 data points into one 256-bit word, meaning 15 times lower gas costs and 15 times less validation time.
  • reduced the volume of data by:
    • aggregating at zip code level --> 3837 points (also added benefit of anonymizing data and complying with GDPR regulations)
    • summarizing hourly consumption into monthly consumption --> 3837 x 12 = 46,044 data points
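The packing step can be sketched as plain bit packing. This is my own illustration, not WePower's exact encoding (which the report doesn't detail): with 17 bits per reading (values up to 131,071), 15 readings take 255 bits and fit in one 256-bit EVM word.

```python
BITS = 17                 # 17 bits per reading -> values up to 2**17 - 1 = 131,071
PER_WORD = 15             # 15 * 17 = 255 bits, fits a 256-bit EVM word

def pack(points):
    """Pack up to 15 readings into one big integer (one EVM storage word)."""
    word = 0
    for p in points:
        assert 0 <= p < 2 ** BITS, "reading too large for 17 bits"
        word = (word << BITS) | p
    return word

def unpack(word, n=PER_WORD):
    """Recover the readings from a packed word (in original order)."""
    points = []
    for _ in range(n):
        points.append(word & (2 ** BITS - 1))
        word >>= BITS
    return points[::-1]

readings = [4_217, 130_000, 0, 99_999] + [7] * 11   # 15 made-up hourly readings
packed = pack(readings)
assert packed < 2 ** 256           # fits in a single storage slot
assert unpack(packed) == readings  # lossless round trip
```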

Test results

Stats

Total Transactions 434
Gas Used 1,129,171,462
Average gas price (Gwei) 9.65921659
Median gas price (Gwei) 2.5
Cheapest gas price (Gwei) 1.5
Highest gas price (Gwei) 60
Average wait time (s) 1,456.5
Median wait time (s) 203.6
Shortest wait time (s) 0.8
Longest wait time (s) 44,882.1

Takeaways

Higher gas prices don't necessarily mean shorter confirmation times

Chart - Median time to transaction confirmation per gas price

With the intention of reducing testing time and costs, they were batching transactions by 15 per block. This actually led to longer waiting times when the "gas price increased while transactions were still waiting to be confirmed. Other transactions from the batch had to wait in line until the gas price would meet again the set price level" (see transactions at gas price 4 Gwei).
"Performance depends on other activities being performed at the same time on the blockchain, meaning that increasing gas price does not always guarantee timely confirmation of transactions."
Speed was achieved by being reactive, i.e. readjusting the set gas price frequently enough to avoid falling below the market price. In practice, they ended up readjusting the gas price every 3 transactions using the ETH Gas Station API.

A gas price equal to or lower than the average still secures confirmation times under 5 minutes

Chart - Distribution of transactions per gas price strategy
  • Safe low: both cheap and successful. Lowest price where at least 5% of the network hash power will accept it.
  • Average: accepted by top miners who account for at least 50% of the blocks mined: safe and prompt. Usually reflects wallets' defaults.
  • Fastest: lowest gas price accepted by all top miners. Should secure transactions will be accepted by all the top pools. Paying more than this price is unlikely to increase transaction confirmation time under normal circumstances.
Chart - Transaction confirmation time distribution

Reading both charts, we see that while ~85% of transactions were submitted using an "average" or worse (safe low) price, ~60% of transactions were still confirmed in less than 5 minutes.

For the energy market, my personal guess is that this confirmation time is satisfactory.

Conclusion

This test was successful both for WePower and the Ethereum network.
On the one hand, the pilot "helped WePower to validate and verify the logic and processes that will be at the core of the WePower platform": they can go forward developing their platform with confidence.
On the other hand their test confirmed that "Ethereum is mature enough to accommodate contracts with multi-year terms." WePower ends their report acknowledging "that the scalability of Ethereum blockchain currently has limitations" but also reminds that "the problem is being tackled by Ethereum developers with plans to implement sharding, i.e. partitioning data into subsets, and moving from energy-intensive Proof of Work to a more environmentally friendly Proof of Stake consensus model".

Data visualisation principles

The motivation to write these lines came from seeing both good and bad visualizations, at work and in the media. So here are some principles that should help you build better visualizations, or detect flaws in poorly designed ones. I'll assume you have some data (symbols, signs, bytes, characters...) that you interpreted to get some information out of it. You now want to communicate and present it to others, and visual communication is often the most effective. Back to data: you basically have two options, tables or graphs.

Graph or table?

First of all ask yourself if you really need a chart. Unlike data tables, graphs are not meant to provide precise quantitative values. Graphs reveal patterns, trends, relationships and exceptions that would be difficult to discern from a table of values.
Sometimes the best graph is no graph.


Visual attributes

Let's now assume you found out you do need a graph. It's good to know a few things about visual perception from a physiological perspective to understand what works and what doesn't. Your eyes can detect a limited set of visual attributes (e.g. color, shape, size...). Due to pre-attentive processing, some of these visual attributes are perceived extremely fast, without any conscious effort. Why should you care? Because you want to visually encode your information so that it is perceived instantly and easily. Here are some pre-attentive visual attributes, from the most to the least accurately perceived:

  1. Position
  2. Length
  3. Angle/Slope
  4. Area
  5. Volume
  6. Color Hue/Density

pre_attentive_visual_attributes

  • Position and length are perceived more accurately, so they are better suited for encoding quantitative data --> how much?
  • Colors or shapes are better suited for encoding categorical data --> what?

Which type of graph?

Ask yourself what you want to show.

Purpose Graph type
Comparison Between items: bar charts, over time: line charts
Distribution Histograms
Relationship scatter chart
Composition stacked bars chart, waterfall chart

This document provides useful help when it comes to choosing the right chart type.


Best practices

Here are some recommendations before finally building your graph:

Save Pies for dessert

Although pies are good at showing part-of-whole relationships, they use area as a visual encoding, which is not very accurate. They also often require redundantly using colors to distinguish values.
Prefer bar charts over pies!
On which chart is it honestly easier/faster to read, order and make sense of the data values without having to explicitly label them?
save_pie_for_dessert

Colors

Use different colors only when they correspond to differences of meaning in the data. In the example above color was actually redundant for the bar charts.

Colors are appropriate to show:

  • categories (different values per item)
  • sequence
  • divergence (diverging color palettes)

Besides these cases, it is very likely that using color (or more than one) is redundant. It should not be carnival on your chart. You are doing data visualization, which is about understanding: an effective chart may look "boring". You are not doing data art, which is about entertaining.

Data look better naked

Remove from your graphic all the ink/pixels that are not related to the numbers/values you actually want to represent (Edward Tufte's concept of maximizing the data-ink ratio). This includes removing backgrounds, frames, axes, shadow/3D effects, gridlines...
You should grasp the idea looking at this animation:

animation

Make your vertical axis start at 0

It is confusing and may convey a wrong message.

On this first chart, it looks like Germany has a big edge over countries like France or Italy.
misleading vertical axis
While actually...
ok vertical axis

Credit & further reading: most of the concepts I have just summed up come from www.perceptualedge.com.

DietPi Home Cloud Server

Block ads and access your data everywhere: self-hosted DNS+VPN+FTP+CLOUD server

I used to rely on cloud services offered by 'powerful, centralized, privately-owned companies' to store and share data between my personal devices. Not happy with how little they value privacy, I decided to host a server myself. It should fulfill the following requirements:

  • [ ] 'network firewall' or 'DNS sinkhole' to block ads and trackers.
  • [ ] file server (ftp)
  • [ ] cloud server (http)
  • [ ] store data on a separate drive
  • [ ] accessible on the go
  • [ ] rely as much as possible on open source products
  • [ ] low cost
  • [ ] headless: no keyboard, mouse or screen, controlled remotely via ssh connection
  • [ ] secure

...a DNS+FTP+CLOUD+VPN server.

1. The Single Board Computer: Raspberry Pi 3B+

The Raspberry Pi is the name of a popular series of single board computers made by the eponymous Foundation. It provides low-cost ($35), high-performance computers, along with outreach and education to help more people access computing and digital making.

The Raspberry Pi operates in the open source ecosystem: it runs Linux and its schematics are released (board itself is not open hardware though).
Costs: 55.39€ (board + case + power supply + SD card)

  • [x] open source
  • [x] low cost

2. The OS: DietPi: Raspberry Pi on diet

DietPi describes itself as lightweight justice for your single board computer. It is an extremely lightweight Debian-based OS. Think of a stripped-down version of 'Raspbian Lite'.

It moreover offers a catalogue of popular, 'ready to use' and optimized software (desktop, media, ssh, cloud, web/file servers...).
So it is optimized for minimal CPU and RAM usage and includes tuned versions of the software I plan to use. DietPi sounds like the perfect OS for my Raspberry Pi.

Installation

  1. Flash the SD card with the latest version of DietPi using Etcher

Optional: pre-configure DietPi for wifi. Locate and edit dietpi-wifi.txt:

aWIFI_SSID[0]='MySSID'
aWIFI_KEY[0]='MyWifiKey'

  2. Check your router's interface to find the IP of the Raspberry Pi, or use nmap: e.g. nmap -sP '192.168.178.*'
  3. Connect via SSH to the Raspberry Pi: ssh root@i.p.add.ress
    • Standard password: dietpi
  4. Go through the installation
  5. Set up a static IP address (required for Pi-hole to work):
    Dietpi Config > 7: Network options: adapters: select your adapter > change DHCP setting to static and apply

  • [x] headless


2. The DNS server: Pi-hole

Pi-hole describes itself as a black hole for Internet advertisements.

Pi-hole basically blocks queries using lists of blacklisted hostnames. Acting as a DNS server makes it a much more powerful ad-blocking application than e.g. browser plugins:

  • All your home devices (including smart TVs) benefit from the network-level blocking
  • Network-level blocking allows blocking ads in non-traditional places, such as in-app ads

Installation

  1. dietpi-software > Pi-hole
    • Select upstream DNS provider > Custom: 46.182.19.48 (digitalcourage.de), 80.241.218.68 (dismail.de)
    • Select default for all other configuration options
  2. Automatic reboot. Relog.
  3. Configure your router: add Raspberry Pi IP as local DNS server
  4. Redefine pihole admin password: pihole -a -p
  5. Last settings:
    • Log in at http://dietpi.ip.address/admin
    • Settings > DNS
      • Interface listening behaviour: should be "interface tun0"
      • Advanced DNS settings:
        • [x] Never forward non-FQDNs
        • [x] Never forward reverse lookups for private IP ranges
        • Conditional Forwarding
          • [x] Use conditional forwarding: provide your router's IP and domain name

Automatic updates

Edit /etc/cron.d/pihole (sudo nano /etc/cron.d/pihole) and add at the end:

# Pi-hole: Auto-Update Pi-hole!
30 2    * * 7    root    PATH="$PATH:/usr/local/bin/" pihole updatePihole

Note: it may be necessary that you reboot your devices before they actually start using the pi-hole DNS server and that their queries get blocked.


3. The storage: mount a usb drive

  1. Plug your usb drive into the raspberry pi
  2. dietpi-software
    • User Data Location >Drive: Launch Dietpi-Drive_Manager
    • Select drive
    • Ensure it is formatted as ext4. If not, use the DietPi formatting feature.
    • Mount and rename
    • [x] User data: Select to transfer DietPi user data to this drive
    • Exit

Check in dietpi-software that 'User Data Location' now indicates: /mnt/yourdrive/dietpi_userdata

  • [x] store data on a separate drive

4. The cloud server: Nextcloud

  1. dietpi-software > software optimised > 114 Nextcloud
  2. Check access
  3. Add the hostname set for your Raspberry Pi (I personally use dynv6 as a provider) and/or your static IP address to the list of trusted domains:
    Edit /var/www/nextcloud/config/config.php

    'trusted_domains' =>
    array (
      0 => 'rasp.berry.pi.ip',
      1 => 'new.dom.ain.ip',
    ),

  4. Increase the max upload and PHP memory sizes: edit /etc/php/7.3/cli/php.ini and /etc/php/7.3/fpm/php.ini and increase post_max_size, upload_max_filesize and memory_limit

5. The FTP server: ProFTP

  1. dietpi-software > File Server > ProFTP
  2. go to ftp://username:pwd@your.raspberrypi.ip.address (port 21)

Change the destination directory

Replace /Path/To/Directory with your target directory.

systemctl stop proftpd
sed -i '/DefaultRoot /c\DefaultRoot /Path/To/Directory' /etc/proftpd/proftpd.conf
systemctl start proftpd

Enable "jailing" (lock users to their home folders)

systemctl stop proftpd
sed -i "/DefaultRoot /c\DefaultRoot ~" /etc/proftpd/proftpd.conf
systemctl restart proftpd

  • [x] FTP server
  • [x] open source (GPL licensed)

6. The VPN server: openVPN

After setting a VPN we will benefit from:

  • access to pi-hole on any of your connected devices even outside of your home LAN
  • more security, as your connection will be encrypted ("tunnelled") while on e.g. a public wi-fi network

  1. Get a hostname for your dynamic (router) IPv4 address (I personally use dynv6 as a provider).
  2. dietpi-software > PiVPN
  3. Use dietpi user
  4. Local DNS: enter the domain of your dynamic DNS address: this ensures that your clients can connect to your PiVPN server even after an IP address change. Your router will have to be configured accordingly too (see further below).
  5. Change the default port for more security: e.g. 3456
  6. DNS Provider for VPN clients: custom > address: 10.8.0.1
  7. No custom search domain
  8. Accept other default options
  9. Reboot and relog

Now we want to define the IP address of the VPN interface (tun0) as the DNS server for the VPN clients. That way we reroute all DNS queries of the clients to our local DNS server, which is pi-hole!

  1. nano /etc/openvpn/server.conf
  2. comment out push "block-outside-dns" (windows specific)
    • Check line push "dhcp-option xxx". Should be: push "dhcp-option DNS 10.8.0.1" If something else is defined, delete/comment out/replace.

Finally the dnsmasq configuration must be extended so that Pi-Hole allows DNS name resolution for the IP address of the VPN interface.

  1. nano /etc/dnsmasq.d/02-pivpn.conf Write line: interface=tun0
  2. nano /etc/pihole/setupVars.conf. Add line: PIHOLE_INTERFACE=tun0
  3. Enable IP forwarding
    • sudo nano /etc/sysctl.d/01-ip_forward.conf: add line net.ipv4.ip_forward=1
  4. Restart services
    • /etc/init.d/openvpn restart
    • /etc/init.d/pihole-FTL restart

Configure router:

  • Set the dynDNS settings
  • Forward the port defined for the VPN server (UDP) so that data packets from outside can reach it

Connect client

  • Add user: pivpn add
  • Copy .ovpn config file to client (e.g using proFTP)
  • Set up client with this config file
    Start VPN session on linux
    sudo openvpn --config path/to/.ovpn file

  • [x] VPN server

  • [x] open source

  • [x] accessible on the go


7. Security

  1. Change the ssh port and forbid root login:
    • Edit /etc/default/dropbear (sudo nano /etc/default/dropbear) and set: DROPBEAR_EXTRA_ARGS="-w -g" and DROPBEAR_PORT=2200
    • service dropbear restart
  2. Exit
  3. Copy public key to Raspberry Pi to avoid entering ssh password every time: ssh-copy-id <USERNAME>@<IP-ADDRESS>
  4. Relog with new user: ssh username@i.p.add.ress -p 2200
  5. Install Fail2Ban: dietpi-software > Fail2Ban
  6. Enable HTTPS
    • dietpi-software > CertBot
    • certbot -d your.domain --manual --preferred-challenges dns certonly
    • Follow instructions and deploy the DNS TXT record _acme-challenge.... and its value
    • Renewal: for the moment manual --> to be improved
  • [x] secure

Conclusion

  • [x] 'network firewall' or 'DNS sinkhole' to block ads and trackers.
  • [x] file server (ftp)
  • [x] cloud server (http)
  • [x] store data on a separate drive
  • [x] accessible on the go
  • [x] rely as much as possible on open source products
  • [x] low cost
  • [x] headless: no keyboard, mouse or screen, controlled remotely via ssh connection
  • [x] secure