Topic: what is CTK ? (Read 177 times)

newbie
Activity: 4
Merit: 0
March 11, 2020, 02:27:11 PM
#5
The on-chain data storage components of Code Token System mainly implement the function of putting data on the chain. The main modules are as follows:
Data awareness -- data acquisition -- data cleaning -- data labelling -- data transmission -- data storage -- consensus accounting

The critical technologies and application modules above make up the on-chain data storage components. Of these, data cleaning and data labelling are industry-based application modules. The system provides an SDK so that users can build application templates for data cleaning and labelling, and later developers can add new industry application modules and contribute them to Code Token System.

1.1 Data awareness

As we all know, there are two kinds of data sources: data generated by software, and data perceived by equipment such as cameras and sensors.

Data perception is based on Internet of Things technology, wireless sensor network technology and mobile Internet technology. It can sense all kinds of data from the physical world, including images, locations, states, two-dimensional data, etc. These data are widely used in life and work; they exist all around us and are actively or passively transmitted to sensing devices and networks. Data perception technology mainly includes RFID, sensors, two-dimensional codes, infrared sensing, GPS positioning, sound and visual recognition, biometric recognition, etc. It mainly collects data about products, logistics, transportation, the environment, infrared imaging, audio and video, location, biometrics (facial recognition and DNA) and so on.
Data-aware devices mainly include Internet of Things devices, smart electronic devices, smart operating terminals, smart household appliances, etc., as well as industrial machines and AI.

1.2 Data acquisition

Data is acquired mainly from software and from sensing equipment. Data from software is mainly collected through the SDK and other middleware. The main categories of application data currently include SQL, NoSQL, big data, app and other system data, network data, and user terminal data.
Data from sensing devices is mainly collected through the sensor network, which mainly includes the Internet of Things, the mobile Internet, wireless sensor networks, etc. A sensor network needs a unified protocol basis, just as the Internet needs TCP/IP. At the core level, because the sensor network is an extension of the Internet, it is also based on TCP/IP. At the access level, there are many types of protocols, which basically fall into the following three groups:
 Intranet: RFID, NB-IoT, LoRa, eMTC, Zigbee, Bluetooth;
 Extranet: Wi-Fi, 2G, 3G/4G, 5G, LTE;
 Network control: TSN, SDN, NFV, etc.

1.3 Data cleaning
Data cleaning refers to the removal of redundant, invalid, and non-compliant data. It involves two steps. The first step removes dirty data that cannot be used, such as records with missing or invalid fields in structured data or corrupted pictures in picture data, and converts special picture file formats to standard bitmap formats. After the first step, all remaining data is usable, but some of it does not meet the requirements in terms of quality. The second step removes data that does not meet the data quality requirements, for example signal data that falls below a specific signal-to-noise ratio, or images in a set of person images that do not show a person.
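
The two steps could look roughly like the sketch below. This is only an illustration under assumptions: the `SignalRecord` layout, the function names and the 10 dB threshold are hypothetical and are not taken from the CTK SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SignalRecord:
    # Hypothetical record layout, used only for this illustration.
    device_id: Optional[str]
    snr_db: Optional[float]  # signal-to-noise ratio in decibels


def remove_unusable(records: List[SignalRecord]) -> List[SignalRecord]:
    """Step 1: drop records with missing or invalid fields."""
    return [r for r in records if r.device_id and r.snr_db is not None]


def remove_low_quality(records: List[SignalRecord], min_snr_db: float) -> List[SignalRecord]:
    """Step 2: drop usable records that fall below the quality threshold."""
    return [r for r in records if r.snr_db >= min_snr_db]


def clean(records: List[SignalRecord], min_snr_db: float = 10.0) -> List[SignalRecord]:
    # The 10 dB default is an assumed threshold, not a value from the post.
    return remove_low_quality(remove_unusable(records), min_snr_db)


raw = [
    SignalRecord("sensor-1", 25.0),  # usable and good quality
    SignalRecord(None, 30.0),        # missing device id: removed in step 1
    SignalRecord("sensor-2", None),  # missing measurement: removed in step 1
    SignalRecord("sensor-3", 4.0),   # usable but too noisy: removed in step 2
]
cleaned = clean(raw)
print([r.device_id for r in cleaned])
```

Keeping the two steps as separate functions lets an uploader report how many records each step removed, which fits the requirement to declare the degree and quality of cleaning.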

For the first step, industry standards are relatively easy to identify, but the quality of the second step is difficult to quantify. When an enterprise uploads data, it must indicate whether the degree and quality of data cleaning meet the standard. When other enterprises use the data and disagree with the result of the data cleaning, they can request alliance arbitration on the specific data, without exposing the whole data set, to determine the validity of the data.

1.4 Data Labelling

Data labelling is an industry-based application module. Its core task is to annotate the collected data so that application components can process it. Because different industries and applications have different requirements for data annotation, the system only provides an SDK, which makes it convenient for industry users to build custom application modules for data annotation.

The elements of data labelling include: data category, collection time, object characteristics (such as gender and age), collection area, data ownership (the collector), etc.
The forms of data annotation include picture annotation, voice annotation, text annotation, video annotation, road annotation, pedestrian annotation, image semantic segmentation and so on.
Data can be annotated manually or by AI. Enterprises can create their own data labelling templates, and after training with deep learning, AI can label the data automatically. As with cleaning, the quality of the labelling needs to be indicated when the company uploads the data.
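
As a rough sketch, a label record carrying the elements listed above might be modelled like this. The class and field names are hypothetical (the post does not show the SDK's actual schema), and the sample values are made up.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass
class DataLabel:
    # Field names are hypothetical; they mirror the elements listed above.
    category: str
    collected_at: datetime
    characteristics: dict
    area: str
    owner: str
    method: str = "manual"    # "manual" or "ai"
    quality: str = "unrated"  # quality indication declared at upload time

    def to_json(self) -> str:
        record = asdict(self)
        # datetime is not JSON serializable, so store it as ISO 8601 text.
        record["collected_at"] = self.collected_at.isoformat()
        return json.dumps(record, sort_keys=True)


label = DataLabel(
    category="pedestrian-image",
    collected_at=datetime(2020, 3, 11, 14, 27, tzinfo=timezone.utc),
    characteristics={"gender": "female", "age_range": "30-39"},
    area="warehouse-entrance",
    owner="enterprise-A",
    method="ai",
    quality="reviewed",
)
print(label.to_json())
```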

1.5 Data transmission

Data is transmitted over TCP/IP. TCP/IP, the Transmission Control Protocol/Internet Protocol suite, is the most basic protocol suite of the Internet. The main application-layer protocols are Telnet, FTP, SMTP, etc., which receive data from the transport layer or pass data to it according to different application requirements and methods. The main transport-layer protocols are UDP and TCP, which are the channels through which users of the platform and the internal data of the computer network achieve data transmission and data sharing. The main network-layer protocols are ICMP, IP and IGMP, which are mainly responsible for moving packets through the network. The network access layer, also known as the network interface layer or data link layer, mainly uses ARP and RARP; its main functions are link management, error detection, and handling the details of different communication media.

1.6 Data storage

After the data is cleaned and labelled, the original data is encrypted once, the context data related to it (such as the quality of labelling and cleaning) is attached, and the result is uploaded to the network for IPFS storage. After IPFS storage, a hash is generated, which will be recorded on the chain.
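
A minimal sketch of this packaging step, under stated assumptions: the data is taken as already encrypted (the post does not name the cipher), a plain SHA-256 digest stands in for the identifier IPFS would return (real IPFS CIDs use a multihash encoding), and `ledger` is a hypothetical in-memory stand-in for the chain.

```python
import hashlib
import json


def build_package(ciphertext: bytes, context: dict) -> bytes:
    """Bundle already-encrypted data with its context metadata."""
    header = json.dumps(context, sort_keys=True).encode("utf-8")
    # Layout: 4-byte big-endian header length, then header, then ciphertext.
    return len(header).to_bytes(4, "big") + header + ciphertext


def content_hash(package: bytes) -> str:
    # Stand-in for the identifier IPFS would return for the stored package.
    return hashlib.sha256(package).hexdigest()


ledger = []  # hypothetical stand-in for the on-chain record

ciphertext = bytes.fromhex("8f0211aa")  # placeholder for the encrypted data
context = {"cleaning_quality": "standard", "labelling_quality": "reviewed"}

package = build_package(ciphertext, context)
digest = content_hash(package)
ledger.append({"owner": "enterprise-A", "hash": digest})
print(ledger[0]["hash"])
```

Because the context is hashed together with the data, changing the declared cleaning or labelling quality later would produce a different hash from the one recorded on the chain.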
After receiving the data, the storage node needs to stay online for a long time and prove the validity of the file storage with a zero-knowledge proof. A zero-knowledge proof means that the prover (the file store) proves to the verifier (other nodes) that it knows or holds a file, without disclosing any information about the file during the proof. A simple zero-knowledge proof process for a file is as follows:
Every once in a while, other nodes that hold the file ask the file store whether a random hash of the file is the same as the one they calculated. The main storage verification algorithms for distributed storage are Provable Data Possession [1] and proof-of-retrievability [2], that is, first proving that the storage node stores the data, and then proving that the data can be retrieved by other nodes.
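
The periodic check can be sketched as a simple challenge-response, assuming that the "random hash" means a hash over a fresh random challenge plus the file contents. The function names are hypothetical. Note that this is not a full zero-knowledge proof, nor Provable Data Possession or proof-of-retrievability, which are designed so that the verifier does not need its own copy of the file.

```python
import hashlib
import hmac
import secrets


def respond(stored_file: bytes, nonce: bytes) -> str:
    """Storage node: hash the fresh nonce together with the stored file."""
    return hashlib.sha256(nonce + stored_file).hexdigest()


def verify(reference_copy: bytes, nonce: bytes, response: str) -> bool:
    """Verifier holding its own copy: recompute the hash and compare."""
    expected = hashlib.sha256(nonce + reference_copy).hexdigest()
    return hmac.compare_digest(expected, response)


original = b"cleaned and labelled logistics data"

# Honest storage node: the response matches the verifier's own calculation.
nonce = secrets.token_bytes(32)
honest_ok = verify(original, nonce, respond(original, nonce))

# Node whose copy is damaged: the response no longer matches.
damaged_ok = verify(original, nonce, respond(original[:-1], nonce))

# Replaying an old answer fails because every challenge uses a new nonce.
old_response = respond(original, nonce)
replay_ok = verify(original, secrets.token_bytes(32), old_response)

print(honest_ok, damaged_ok, replay_ok)
```

The fresh nonce is what forces the storage node to read the file again each time, instead of caching a single hash and discarding the data.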

1.7 Consensus Mechanism

Code Token System bookkeeping consensus adopts the PBFT algorithm [3], and the whole architecture is based on IBM Hyperledger Fabric. The consensus on file storage is achieved through Provable Data Possession and proof-of-retrievability. The governance consensus is mainly carried out through the mechanism of on-chain voting, which will be described in the second section.
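
For reference, the fault-tolerance arithmetic behind PBFT can be sketched as follows. This only illustrates the general bound n >= 3f + 1 from the PBFT literature; the node counts used are arbitrary examples, and nothing here describes how Code Token System configures its consensus nodes.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine nodes f that n nodes tolerate (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Matching messages needed so that any two quorums share an honest node.

    Equals ceil((n + f + 1) / 2), which reduces to 2f + 1 when n == 3f + 1.
    """
    f = max_faulty(n)
    return (n + f + 2) // 2


for n in (4, 7, 10):
    print(n, max_faulty(n), quorum(n))
```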

1.8 Application scenarios


We take one application scenario as an example to walk through the on-chain data storage components:
1) Data awareness and acquisition: enterprise A is a logistics enterprise, which collects product logistics information and environmental information.
2) Data cleaning and labelling: enterprise A cleans and filters the collected data according to its own industry requirements, and then uses AI technology to label the data.
Note that data cleaning and labelling can be treated as industry application modules and built by the enterprise itself.
3) Data transmission, IPFS storage and consensus bookkeeping: enterprise A uploads the data to IPFS distributed storage, generates a hash, and launches a transaction that packages the hash for bookkeeping; once consensus is reached, the hash is recorded on the chain. After that, enterprise A needs to periodically initiate zero-knowledge proof verification, or delegate other CTK nodes to initiate verification requests.
The above application scenario has two problems in practice:
1) The data on-chain components need to use enterprise A's cloud storage technology when putting data on the chain, and storage services for relatively large amounts of data incur expensive storage costs.
2) Enterprise A needs to publish data to the blockchain. Enterprises that lack incentives are often not motivated to publish data.
The first problem can be addressed with decentralized storage technology similar to IPFS. However, because of the nature of the alliance chain, public IPFS nodes often struggle to meet the backup requirements of large amounts of data, and IPFS nodes run among the alliance enterprises also cost money to operate. Initiating IPFS file storage validation also requires being online for a long time. The second problem calls for an effective mechanism that encourages data storage and data sharing.
In addition, current blockchain technology, especially the alliance chain, is mostly used by medium and large enterprises.
This is mainly because the alliance chain chiefly serves the financial side of the enterprise, such as bookkeeping, asset securitization, supply chain finance and so on. Small and micro businesses have no incentive to apply blockchain technology to their own production.
To solve the above two problems, we propose the eco-application component of Code Token System, which has two purposes:
1) Effectively encourage enterprises to exchange and store data.
2) Support small and micro businesses so that they can join the whole alliance at a low cost.
newbie
Activity: 4
Merit: 0
March 10, 2020, 03:40:33 PM
#4
Code Token System will inherit independent open protocols and standards and deliver business applications through a framework and dedicated modules, including the consensus mechanism and storage methods of each blockchain, as well as identity services, access control and smart contracts. The various branches of Hyperledger, including Burrow (contributed by Monax and co-sponsored by Intel) and IBM's Fabric, are all booming.
Code Token System plans to create 17 primary super nodes, 14 standby super nodes, and 2,000 normal nodes. Block data is stored on the super nodes. Any user can download the super node program from GitHub for free and set up their own running environment. The program automatically selects the top 17 super nodes based on hardware performance, block storage speed, network access speed, and GPU and CPU computing speed, and records them in the block. In addition, Code Token System will open a race for the 14 standby super nodes, which share the same benefits as the primary super nodes. If Code Token System suffers a crushing blow, any node can restore the Code Token System network using the freely downloadable program code.
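
The ranking-based selection could be sketched as below. This is an illustration under assumptions, not the actual super node program: the `NodeMetrics` fields, the equal-weight score, and filling the standby seats from the next 14 ranked candidates are all hypothetical, since the post names the criteria but not how they are combined or how the standby race works.

```python
from dataclasses import dataclass
from typing import List, Tuple

PRIMARY_COUNT = 17  # from the post
STANDBY_COUNT = 14  # from the post


@dataclass(frozen=True)
class NodeMetrics:
    # Hypothetical normalized benchmark results for one candidate node.
    node_id: str
    hardware: float
    storage_speed: float
    network_speed: float
    compute_speed: float


def score(m: NodeMetrics) -> float:
    # Equal weights are an assumption; the post gives no formula.
    return m.hardware + m.storage_speed + m.network_speed + m.compute_speed


def select_super_nodes(
    candidates: List[NodeMetrics],
) -> Tuple[List[NodeMetrics], List[NodeMetrics]]:
    # Highest score first; ties are broken by node id for determinism.
    ranked = sorted(candidates, key=lambda m: (-score(m), m.node_id))
    primary = ranked[:PRIMARY_COUNT]
    standby = ranked[PRIMARY_COUNT:PRIMARY_COUNT + STANDBY_COUNT]
    return primary, standby


candidates = [NodeMetrics(f"node-{i:02d}", i, i, i, i) for i in range(40)]
primary, standby = select_super_nodes(candidates)
print(len(primary), len(standby), primary[0].node_id)
```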
Code Token System on GitHub:
https://github.com/ctk-program
newbie
Activity: 4
Merit: 0
March 02, 2020, 04:42:01 PM
#3
What is the original intention of the Code Token System (CTK)?

Everything will support sidechain applications.

We will hold the first referendum in the history of blockchain to decide whether to adopt a free ecosystem sidechain support model.

This action will greatly lower the threshold for large-scale application of the blockchain, and bring sharing, democracy, and fairness more truly to each of us.
legendary
Activity: 2086
Merit: 1321
Bitcoin needs you!
February 25, 2020, 03:13:55 PM
#2
 Hmmm? Advertising?
newbie
Activity: 4
Merit: 0
February 25, 2020, 03:12:44 PM
#1
The CTK System is an enterprise BaaS (Blockchain as a Service) of the OpenSDS project and will be chartered under the Linux Foundation and the Hyperledger Foundation.
