# STOCK MARKET DATA ANALYSIS
### Use Python to scrape data and join it with financial data from the Yahoo Finance API (or another finance API). Use data manipulation and visualization for financial and investment analysis (e.g., compare rates of return, calculate risk, build trading algorithms, and make investment decisions).
#### First, this project will start with technical analysis, measuring stock price data for movement (volatility) and volume.
#### Second, I plan to include a branch with fundamental analysis, to look more in depth at financial statement analysis.
#### Third, I am interested in expanding the analysis to include Python for cryptocurrencies, such as financial and investment analysis for ICOs, and predicting crypto prices.
##### <b>DISCLAIMER:</b> I am not a financial adviser nor a CPA. This project is for educational purposes only. Investing of any kind involves risk. Although it is possible to minimize risk, your investments are solely your responsibility. It is very important to conduct your own research. I am merely sharing my project/opinion with no guarantee of either gains or losses on any investments.
# <b> Requirements & Overview:</b>
### Build a web scraper to get data for the S&P500 list
### Use Yahoo Finance (or other) API for financial data
### Use Pandas to join stock tickers with financial data
### Analyze data with various quantitative models to calculate risk and make investment decisions.
#### - Download data as csv and read.
#### - Build quantitative models to predict returns & evaluate risk.
#### - Run basic Value at Risk (VaR) calculations and Monte Carlo simulations.
#### - Look at news sentiment as a proxy for market volatility.
#### - Graph/visualize data.
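The VaR and Monte Carlo items in the list above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names, the 95% confidence level, and the choice of geometric Brownian motion for the simulation are all assumptions made for the example.

```python
import numpy as np


def historical_var(returns, confidence=0.95):
    """Historical VaR as a positive fraction of portfolio value.

    Reads the loss threshold directly off the empirical distribution
    of past returns (no distributional assumption).
    """
    returns = np.asarray(returns, dtype=float)
    return -np.percentile(returns, 100 * (1 - confidence))


def monte_carlo_var(start_price, mu, sigma, days=1, n_sims=10_000,
                    confidence=0.95, seed=0):
    """VaR in price units from simulated end prices.

    Assumption: prices follow geometric Brownian motion, with `mu` and
    `sigma` given as DAILY drift and volatility.
    """
    rng = np.random.default_rng(seed)
    # Daily log-returns under GBM: N(mu - sigma^2 / 2, sigma)
    log_returns = rng.normal(mu - 0.5 * sigma ** 2, sigma, size=(n_sims, days))
    end_prices = start_price * np.exp(log_returns.sum(axis=1))
    pnl = end_prices - start_price
    return -np.percentile(pnl, 100 * (1 - confidence))
```

For example, `historical_var(daily_returns)` gives the fraction of value that daily losses exceeded on only about 5% of past days, while `monte_carlo_var(100, 0.0, 0.02)` estimates the same threshold in price units for a hypothetical $100 stock with 2% daily volatility.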
##### <b> NOTE:</b> If you are new to Python, check out the Python Programming Fundamentals website for tutorials at https://pythonprogramming.net/introduction-learn-python-3-tutorials/using. You will need to review up to installing Python packages and modules with pip.
## <b>Packages required for this project:</b>
### pandas
### pandas-datareader
### beautifulsoup4
### scikit-learn
#### * Dependencies & modules you might have to install: NumPy, Matplotlib, mplfinance, yahoo (or another finance API)
### <b> Use Pandas/NumPy, Matplotlib for data manipulation and visualization.</b>
### Scrape stock tickers from the web (e.g., Wikipedia)
### For this project, the S&P500 list is from: "List of S&P 500 Companies" at
https://en.wikipedia.org/wiki/List_of_S%26P_500_companies
### Use Requests to download the page, then parse the HTML with Beautiful Soup
### Using this soup object, navigate/search HTML for data you want to pull.
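A rough sketch of the Requests and Beautiful Soup flow described above. It assumes the ticker symbols sit in the first column of the first table with class `wikitable` on the page; that layout is an assumption about the Wikipedia page and may change. The function names and the User-Agent string are hypothetical.

```python
import requests
from bs4 import BeautifulSoup

WIKI_URL = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"


def parse_tickers(html):
    """Pull ticker symbols out of the page HTML.

    Assumption: the first 'wikitable' holds one company per row,
    with the ticker in the first <td> cell.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "wikitable"})
    tickers = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        cells = row.find_all("td")
        if cells:
            tickers.append(cells[0].get_text(strip=True))
    return tickers


def fetch_tickers(url=WIKI_URL):
    """Download the page with Requests and parse it."""
    # Hypothetical User-Agent; some sites reject requests without one.
    headers = {"User-Agent": "sp500-scraper-example"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    return parse_tickers(response.text)
```

Keeping the parsing separate from the download makes it possible to check `parse_tickers` against a saved copy of the page without hitting the network.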
## <b>INSTRUCTIONS:</b>
## <b>Step 1:</b> Intro to Using Python for Finance
### How to get and work with stock market data.
#### imports: pandas, pandas-datareader, numpy, matplotlib, beautifulsoup4, sklearn, datetime
## <b>Step 2:</b> Handling and Graphing Data
### Basic data handling with Pandas, and using Matplotlib for data visualization. Yahoo Finance is used in this step, but you can use another financial data API.
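As a minimal sketch of this step, the snippet below loads daily price data from csv into a DataFrame and plots the closing price. The column names (`Date`, `Open`, `High`, `Low`, `Close`, `Volume`) are an assumption about the csv layout, and the sample rows are made-up numbers used only so the example runs without a download.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows (not real market data), standing in for a downloaded csv.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2020-05-18,10.0,10.6,9.9,10.5,1200
2020-05-19,10.5,10.8,10.2,10.3,900
2020-05-20,10.3,10.9,10.3,10.8,1500
2020-05-21,10.8,11.0,10.4,10.6,1100
2020-05-22,10.6,10.7,10.1,10.2,800
"""


def load_prices(csv_source):
    """Read price data with the date column as a sorted DatetimeIndex."""
    df = pd.read_csv(csv_source, parse_dates=["Date"], index_col="Date")
    return df.sort_index()


def plot_close(df, ax=None):
    """Line chart of the daily closing price; returns the Matplotlib axes."""
    if ax is None:
        _, ax = plt.subplots()
    df["Close"].plot(ax=ax, title="Daily close")
    ax.set_ylabel("Price")
    return ax


df = load_prices(io.StringIO(SAMPLE_CSV))
```

`load_prices` also accepts a file path, so the same function works on a csv saved from whichever finance API you use. Call `plot_close(df)` followed by `plt.show()` to display the chart.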
## <b>Step 3:</b> Stock Data Manipulation
### Additional data visualization, such as calculating moving averages and creating candlestick charts. Resampling with Pandas. Plotting data with subplots using a shared axis.
#### New Imports: mplfinance, candlestick_ohlc
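The moving-average, resampling, and shared-axis ideas in this step might look like the sketch below. The 10-day window, the 10-day resampling period, and the synthetic price series are arbitrary choices for illustration; the candlestick drawing itself is left out.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data (not real prices): closes rising from 1 to 30 over 30 days.
dates = pd.date_range("2020-01-01", periods=30, freq="D")
df = pd.DataFrame({"Close": np.arange(1.0, 31.0), "Volume": 100}, index=dates)

# Moving average: mean of the last 10 closes (fewer at the start of the series).
df["MA10"] = df["Close"].rolling(window=10, min_periods=1).mean()

# Resample daily closes into 10-day open/high/low/close bars,
# the shape of data a candlestick chart expects.
ohlc = df["Close"].resample("10D").ohlc()
volume = df["Volume"].resample("10D").sum()


def plot_price_and_volume(frame):
    """Price on top, volume below, sharing one x-axis."""
    fig, (ax_price, ax_volume) = plt.subplots(2, 1, sharex=True)
    ax_price.plot(frame.index, frame["Close"], label="Close")
    ax_price.plot(frame.index, frame["MA10"], label="MA10")
    ax_price.legend()
    ax_volume.bar(frame.index, frame["Volume"])
    return fig
```

With `sharex=True`, zooming or panning one subplot moves the other, which keeps price and volume aligned by date.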
## <b>Step 4:</b> S&P500 List Automation
### Get the S&P500 data for all companies included in this index. In general, this list covers the 500 most valuable U.S. companies by market capitalization (calculated as the number of outstanding shares multiplied by the stock price per share). The list used here is as of 5/24/2020 (from https://en.wikipedia.org/wiki/List_of_S%26P_500_companies). Build a web scraper using Beautiful Soup to parse the page, grab the stock tickers, and save them as a csv. Use pickle to serialize the S&P500 list.
#### New Imports: beautifulsoup4, pickle, requests
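Once the tickers have been scraped into a Python list, saving them could look like the sketch below. The file names `sp500tickers.pickle` and `sp500tickers.csv` and both function names are hypothetical.

```python
import csv
import pickle


def save_tickers(tickers, pickle_path="sp500tickers.pickle",
                 csv_path="sp500tickers.csv"):
    """Serialize the ticker list with pickle and also write it as a csv."""
    with open(pickle_path, "wb") as f:
        pickle.dump(tickers, f)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ticker"])  # header row
        writer.writerows([t] for t in tickers)


def load_tickers(pickle_path="sp500tickers.pickle"):
    """Reload the pickled list so the page need not be scraped again."""
    with open(pickle_path, "rb") as f:
        return pickle.load(f)
```

The pickle gives a quick round trip back to a Python list, while the csv stays readable in a spreadsheet or with Pandas.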
## <b>Step 5:</b> S&P500 Stock Price Data
### Get stock price data for all companies in the S&P500 list from step 4. Create a directory with daily stock price data from 1/1/2000 to 5/24/2020 (or your current date) for all 500 companies in the S&P500 index.
#### New Imports: os
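One possible shape for this step is sketched below: loop over the tickers and write one csv per company into a directory. The directory name `stock_dfs` is hypothetical, and `fetch_prices` is a placeholder for whatever function you write around your chosen finance API; it is assumed to take a ticker and return a DataFrame of daily prices.

```python
import os

import pandas as pd


def save_price_data(tickers, fetch_prices, out_dir="stock_dfs"):
    """Write one csv of daily prices per ticker into `out_dir`.

    `fetch_prices` is a caller-supplied function (hypothetical here):
    fetch_prices(ticker) -> pandas.DataFrame indexed by date.
    Returns the list of file paths written on this run.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for ticker in tickers:
        path = os.path.join(out_dir, f"{ticker}.csv")
        if os.path.exists(path):
            continue  # already downloaded on an earlier run
        prices = fetch_prices(ticker)
        prices.to_csv(path)
        saved.append(path)
    return saved
```

Skipping files that already exist means an interrupted download of 500 companies can be resumed without starting over.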
## <b>Step 6:</b> Combine DataFrames for S&P500 List and Stock Price Data
### Join the stock tickers dataframe with the stock price dataframe, and save the result into one csv. This csv will have the daily closing price from 1/1/2000 to 5/24/2020 (or your current date) for all 500 companies in the S&P500 index.
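A sketch of the join, assuming each company's prices are already loaded as a DataFrame indexed by date with a `Close` column. The function name and the output file name in the usage note are hypothetical.

```python
import pandas as pd


def combine_closes(frames):
    """Build one table of closing prices, one column per ticker.

    `frames` maps ticker -> DataFrame indexed by date with a 'Close' column.
    An outer join keeps every date; tickers with no data on a date get NaN.
    """
    columns = [df["Close"].rename(ticker) for ticker, df in frames.items()]
    return pd.concat(columns, axis=1, join="outer").sort_index()
```

The combined table can then be written out with something like `combine_closes(frames).to_csv("sp500_joined_closes.csv")`.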
## <b>Step 7:</b> Building S&P500 Correlation Table
### Using the combined S&P500 csv file, we will look for relationships in this dataset. There are missing values, because the data is pulled from 1/1/2000 to 5/24/2020 (or your current date) and not every company has price data for that whole period. So, this example will be fairly basic. Eventually, we will look at specific time frames for deeper analysis.
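A correlation table over the combined closing prices could be built as in the sketch below. Correlating daily percentage returns rather than raw price levels is a choice made for this example, not something the step prescribes; calling `closes.corr()` directly would correlate the price levels instead.

```python
import pandas as pd


def correlation_table(closes):
    """Pairwise correlation of daily returns, one row and column per ticker.

    `closes` has one column of closing prices per ticker. Missing values
    are left as NaN; `corr` then uses only the dates where both tickers
    in a pair have data.
    """
    returns = closes.pct_change(fill_method=None)
    return returns.corr()
```

Each cell ranges from -1 (the two stocks move in opposite directions) to 1 (they move together), and the table is symmetric about its diagonal.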
####
# Python for Machine Learning
## <b>Step 8:</b> ML: Preprocess Stock Market Data
### After manipulating data in steps 1-7, we can come up with a system for investment decisions. We can apply machine learning and deep learning to invest in companies, and back-test our trading strategy.
####
## <b>Step 9:</b> ML: Create Target Function
###
####
## <b>Step 10:</b> ML: Create Labels
###
####
## <b>Step 11:</b> ML
###
####