Python Project Report:
Each team will document their analysis and results in a report. The report will consist of:
o Cover page (title, group members)
o Executive summary
o Project motivation/background
o Key questions
o Data source
o Data description
o Data transformation/Exploratory data analysis
o Models and analysis
o Findings and managerial implications
o Appendix: Python code with proper documentation
o References, if any
One report per group for submission. Messy or hard-to-read reports will receive a penalty.
Feel free to add other sections if needed. If you feel any of the above required sections
should not be included in the final report, please talk to me. Missing important parts will
lead to a penalty. There is no page limit.
Each team will present their project to the class. Presentation will focus on business
problem, analysis, and implications for the business. Logistics related to project
presentation will be provided later.
Executive Summary
The aim of this report is to determine the sentiment of Twitter users toward the advent of Web 3.0. To do so, we apply Natural Language Processing (NLP) techniques to extract relevant features from the collected tweet text and use those features to draw conclusions from the data.
We believe that the next wave of computing innovation, along with entirely new sectors of the economy, will be built on decentralized technology. Decentralized technologies offer an alternative to a digital status quo that is increasingly dominated by big tech and oppressive regimes. Open, democratized systems can provide the infrastructure to power tomorrow’s economy and institutions. Realizing that potential will depend on collaboration between government and the private sector on regulatory frameworks that encourage innovation while managing the risks inherent in different applications.
The internet has come a long way since it first reached the general public with Web1. Web1 was built on open internet protocols, largely from the 1970s and 80s, including TCP, IP, SMTP, and HTTP. These protocols were designed to bring openness and inclusiveness to the market: anyone could build on top of them without permission.
After a while, it became clear that open-source protocols were very hard to monetize. This led to a business model of building proprietary, closed protocols on top of the existing open ones, which came to be known as Web 2.0. Web 2.0 is a centralized platform in which services are run by a few companies that act as internet gatekeepers, such as Apple and Google (Kurgun et al., 2017). Although developers benefited initially, it later became hard for them to survive because of this centralization. Several of these gatekeeper companies are now among the most valuable in history, and while users interact online for free, they are obliged to trust opaque code and business models that mostly rely on selling user data.
We are now in the early stages of developing Web3, in which communities are incentivized and rewarded for maintaining and developing core infrastructure. Web 3.0 emerged to address the issues of Web 2.0. It is a decentralized platform that keeps the features of Web 2.0 but is owned by its developers and users. The Web 3.0 business model resembles that of Web 1.0, with open-source protocols, but ownership is collective and coordinated through cryptoeconomics (Mersch, 2019). It is independent of traditional organizations, and its code is executed as written. Web3 values open source, users' ownership of their own data, and permissionless access. This in turn creates a shared sense of identity and collaboration.
The decentralized networks of web3 offer alternatives to the broken digital status quo. While centralization has helped billions of people get access to amazing technologies, many of which were free to use, it has also stifled innovation (Dixon, 2018). Right now, companies that own networks have unilateral power over important questions like who gets network access, how revenue is divided, what features are supported, how user data is secured, and so on. That makes it harder for startups, creators and other groups to grow their internet presence because they must worry about centralized platforms changing the rules and taking away their audiences or profits.
The classic challenge of decentralized networks is that they are public goods. Without a central entity to control decisions and capture profits, it is hard to incentivize their maintenance and development. Crypto helps solve this problem through decentralized coordination and providing economic incentives for development (Dixon, 2019). Web 3.0 will put power in the hands of communities rather than corporations.
Decentralized networks are an important counter to the fragility of centralized applications. For example, in June 2021, internet users were unable to connect to top websites including The New York Times, the Guardian, Twitch, Reddit and the British government’s homepage—because a single company, Fastly, was crippled by a software bug. Decentralized systems avoid single points of failure.
Decentralized networks can also neutralize the unilateral control exerted by centralized platforms. For example, the decentralized, permanent data storage blockchain platform Arweave was used by activists in China to permanently upload copies of Hong Kong’s Apple Daily before it was blocked by censors, and to save criticism of the country’s coronavirus response before it was deleted from the social media platform Weibo.
The key questions this project addresses are as follows:
1. What are the advantages of Web 3.0 over Web 2.0?
2. How safe is this network for digital payments?
3. What is the future of cryptocurrencies under Web 3.0?
4. What are the prospects of the creator economy under Web 3.0?
5. What are the prospects of non-fungible tokens under Web 3.0?
6. Will artists benefit from this business model?
7. What business models can we propose based on the data?
The prediction questions that this project might answer are as follows:
1. What is the user perception of moving to Web 3.0?
2. What is the user perception of decentralized finance?
We seek to answer these questions by proposing a system that performs sentiment analysis on the collected data and gives a well-grounded explanation of the likely future direction of Web 3.0.
The data for this research will be obtained from Twitter. We propose to collect it using the free Twitter API.
How we approach the project
First, we extract tweets tagged web3.0 from the Twitter API and construct a dataset with the following columns: user, tweet, polarity of tweet, and date. We then pre-process the data through lemmatization and the removal of stop words, punctuation, and numbers. Features are extracted using TF-IDF vectorization. Finally, we split the data into a training set and a test set and fit various predictive models (e.g., naive Bayes).
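The steps above can be sketched in a few lines of Python. This is a minimal, dependency-free illustration under stated assumptions, not the project's actual code: the tweets and polarity labels are made up, the stop-word list is a tiny placeholder, the train/test split is fixed by hand, and lemmatization is left out because it needs an external library. The function names (preprocess, fit_idf, tfidf, train_nb, predict) are hypothetical, and the IDF formula is one common smoothed variant.

```python
import math
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list (assumption); a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "for", "i", "in", "is", "it",
              "my", "of", "on", "the", "this", "to"}


def preprocess(tweet):
    """Lowercase, strip URLs and @mentions, drop punctuation, numbers and stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)  # URLs and @mentions
    text = re.sub(r"[^a-z\s]", " ", text)           # punctuation and digits
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf(doc, idf):
    """Map a token list to {term: tf * idf}; terms unseen in training are ignored."""
    counts = Counter(tok for tok in doc if tok in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial naive Bayes over TF-IDF weights with additive smoothing."""
    class_counts = Counter(labels)
    totals = {c: defaultdict(float) for c in class_counts}
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            totals[label][term] += weight
    model = {}
    for c, n_c in class_counts.items():
        denom = sum(totals[c].values()) + alpha * len(vocab)
        log_lik = {t: math.log((totals[c][t] + alpha) / denom) for t in vocab}
        model[c] = (math.log(n_c / len(labels)), log_lik)
    return model


def predict(vec, model):
    """Return the class with the highest log prior plus weighted log likelihood."""
    scores = {c: prior + sum(w * log_lik[t] for t, w in vec.items())
              for c, (prior, log_lik) in model.items()}
    return max(scores, key=scores.get)


# Hypothetical (tweet, polarity) rows standing in for the collected dataset.
train_rows = [
    ("Web3 gives creators real ownership, love it!", "positive"),
    ("Decentralized finance is a scam, lost my money", "negative"),
    ("Love the decentralized future of web3", "positive"),
    ("Web3 is overhyped and full of scam projects", "negative"),
    ("NFTs help artists earn, great for creators", "positive"),
    ("Lost 100% on crypto, total scam https://t.co/x", "negative"),
]
# Held-out tweets (a hand-fixed split; a real run would split at random).
test_tweets = ["I love how web3 helps creators", "Another crypto scam, money lost"]

train_docs = [preprocess(tweet) for tweet, _ in train_rows]
train_labels = [label for _, label in train_rows]
idf = fit_idf(train_docs)
model = train_nb([tfidf(doc, idf) for doc in train_docs], train_labels, idf)
predictions = [predict(tfidf(preprocess(t), idf), model) for t in test_tweets]
print(predictions)
```

Fitting the IDF weights on the training tweets only, and reusing them for the held-out tweets, keeps information from the test set out of the features. In practice a library implementation of the vectorizer and classifier would likely replace the hand-written functions.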
By answering the research and prediction questions, we will analyze people's sentiment toward Web 3.0 and its different use cases, and whether its emergence will affect the Big Tech companies that are built on Web 2.0.
References
Dixon, C. (2018). Why decentralization matters. Medium. https://onezero.medium.com/why-decentralization-matters-5e3f79f7638e
Dixon, C. (2019, January 4). Blockchain can wrest the internet from corporations’ grasp. Wired. https://www.wired.com/story/how-blockchain-can-wrest-the-internet-from-corporations/
Kurgun, H., Kurgun, A., & Aktas, E. (2017, October). Web 4.0 Promise for Tourism Ecosystem: A Qualitative Research on Tourism Ecosystem Stakeholders’ Awareness. In Global Conference on Services Management (GLOSERV 2017) (Vol. 10, p. 284).
Mersch, M. (2019, April 24). Which new business models will be unleashed by Web 3.0? Medium. https://medium.com/fabric-ventures/which-new-business-models-will-be-unleashed-by-web-3-0-4e67c17dbd10