출처 : 금융 데이터 분석을 위한 파이썬 판다스
예제파일
shift함수: 상하 이동
import pandas as pd
df = pd.read_excel("DB/ss_ex_1.xlsx" , index_col=0)
df.index = pd.to_datetime(df.index)
df = df.sort_index()
df1 = df["거래량"].shift(1)
print(df1)
일자
2021-02-15 NaN
2021-02-16 23529706.0
2021-02-17 20483100.0
2021-02-18 18307735.0
2021-02-19 21327683.0
...
2021-08-09 13342623.0
2021-08-10 15522581.0
2021-08-11 20362639.0
2021-08-12 30241137.0
2021-08-13 42365223.0
Name: 거래량, Length: 127, dtype: float64
Process finished with exit code 0
df["전일거래량"] = df["거래량"].shift(1) # 음수 - 상향 이동
df1 = df[ ['거래량', '전일거래량'] ]
print(df1)
거래량 전일거래량
일자
2021-02-15 23529706.0 NaN
2021-02-16 20483100.0 23529706.0
2021-02-17 18307735.0 20483100.0
2021-02-18 21327683.0 18307735.0
2021-02-19 25880879.0 21327683.0
... ... ...
2021-08-09 15522581.0 13342623.0
2021-08-10 20362639.0 15522581.0
2021-08-11 30241137.0 20362639.0
2021-08-12 42365223.0 30241137.0
2021-08-13 61270643.0 42365223.0
[127 rows x 2 columns]
df["전일거래량"] = df["거래량"].shift(1)
cond = df["거래량"] > df["전일거래량"]
df2=df[cond]
print(df2)
종가 대비 등락률 ... 시가총액 상장주식수 전일거래량
일자 ...
2021-02-18 82100.0 -1100.0 -1.32 ... 4.901191e+14 5.969783e+09 18307735.0
2021-02-19 82600.0 500.0 0.61 ... 4.931040e+14 5.969783e+09 21327683.0
2021-02-24 82000.0 0.0 0.00 ... 4.895222e+14 5.969783e+09 20587314.0
2021-02-25 85300.0 3300.0 4.02 ... 5.092225e+14 5.969783e+09 26807651.0
2021-02-26 82500.0 -2800.0 -3.28 ... 4.925071e+14 5.969783e+09 34155986.0
... ... ... ... ... ... ... ...
2021-08-09 81500.0 0.0 0.00 ... 4.865373e+14 5.969783e+09 13342623.0
2021-08-10 80200.0 -1300.0 -1.60 ... 4.787766e+14 5.969783e+09 15522581.0
2021-08-11 78500.0 -1700.0 -2.12 ... 4.686279e+14 5.969783e+09 20362639.0
2021-08-12 77000.0 -1500.0 -1.91 ... 4.596733e+14 5.969783e+09 30241137.0
2021-08-13 74400.0 -2600.0 -3.38 ... 4.441518e+14 5.969783e+09 42365223.0
[66 rows x 11 columns]
diff함수: 이전 인덱스 데이터와의 차이
print("영업일:", len(df2)) # 영업일: 66
df3=df2['거래량'].diff( ) # 이전 인덱스 데이터와의 차이
print(df3)
일자
2021-02-18 NaN
2021-02-19 4553196.0
2021-02-24 926772.0
2021-02-25 7348335.0
2021-02-26 4364814.0
...
2021-08-09 -10119787.0
2021-08-10 4840058.0
2021-08-11 9878498.0
2021-08-12 12124086.0
2021-08-13 18905420.0
Name: 거래량, Length: 66, dtype: float64
cond = df['거래량'].diff() > 0
df4=df[cond]
print(df4)
종가 대비 등락률 ... 시가총액 상장주식수 전일거래량
일자 ...
2021-02-18 82100.0 -1100.0 -1.32 ... 4.901191e+14 5.969783e+09 18307735.0
2021-02-19 82600.0 500.0 0.61 ... 4.931040e+14 5.969783e+09 21327683.0
2021-02-24 82000.0 0.0 0.00 ... 4.895222e+14 5.969783e+09 20587314.0
2021-02-25 85300.0 3300.0 4.02 ... 5.092225e+14 5.969783e+09 26807651.0
2021-02-26 82500.0 -2800.0 -3.28 ... 4.925071e+14 5.969783e+09 34155986.0
... ... ... ... ... ... ... ...
2021-08-09 81500.0 0.0 0.00 ... 4.865373e+14 5.969783e+09 13342623.0
2021-08-10 80200.0 -1300.0 -1.60 ... 4.787766e+14 5.969783e+09 15522581.0
2021-08-11 78500.0 -1700.0 -2.12 ... 4.686279e+14 5.969783e+09 20362639.0
2021-08-12 77000.0 -1500.0 -1.91 ... 4.596733e+14 5.969783e+09 30241137.0
2021-08-13 74400.0 -2600.0 -3.38 ... 4.441518e+14 5.969783e+09 42365223.0
[66 rows x 11 columns]
# 절대 모멘텀(momentum) 투자 전략
yeild = df['종가'] / df['종가'].shift(6)
cond = yeild >= 1.03
print(len(df[cond]))
cond_modified = cond.shift(1).fillna(False)
s = df.loc[cond_modified, '종가'] / df.loc[cond_modified, '시가']
print(s.cumprod().iloc[-1])
이동평균
import pandas as pd
df = pd.read_excel("DB/ss_ex_1.xlsx", index_col=0)
df.index = pd.to_datetime(df.index)
df = df.sort_index()[["종가"]]
df['종가D-1'] = df['종가'].shift(1)
df['종가D-2'] = df['종가'].shift(2)
df['ma3'] = (df['종가'] + df['종가D-1'] + df['종가D-2']) / 3
print(df.head())
종가 종가D-1 종가D-2 ma3
일자
2021-02-15 84200.0 NaN NaN NaN
2021-02-16 84900.0 84200.0 NaN NaN
2021-02-17 83200.0 84900.0 84200.0 84100.000000
2021-02-18 82100.0 83200.0 84900.0 83400.000000
2021-02-19 82600.0 82100.0 83200.0 82633.333333
df['rolling3'] = df['종가'].rolling(3).mean()
print(df.head())
종가 종가D-1 종가D-2 ma3 rolling3
일자
2021-02-15 84200.0 NaN NaN NaN NaN
2021-02-16 84900.0 84200.0 NaN NaN NaN
2021-02-17 83200.0 84900.0 84200.0 84100.000000 84100.000000
2021-02-18 82100.0 83200.0 84900.0 83400.000000 83400.000000
2021-02-19 82600.0 82100.0 83200.0 82633.333333 82633.333333
import pandas as pd
df = pd.read_excel("DB/ss_ex_1.xlsx", index_col=0)
df.index = pd.to_datetime(df.index)
df['ma5'] = df['종가'].rolling(5).mean().shift(1)
cond = df['ma5'] < df['시가']
print("상승일:", len(df[cond])) # 상승일: 76
print("영업일:", len(df)) # 영업일: 127
지수이동평균(exponential moving average, EMA)
: 최근 데이터에 높은 가중치를 부여하는 지수이동평균
# EMA(i) = k * price(i) + (1-k) * EMA(i-1)
# N : 일단위
# k=2/(N+1)
from pandas import Series
data = [84200, 84900, 83200, 82100, 82600]
index = ["2021-02-15", "2021-02-16", "2021-02-17", "2021-02-18", "2021-02-19"]
s = Series(data, index)
s1 = s.ewm(span=3, adjust=False).mean()
print(s1)
2021-02-15 84200.00
2021-02-16 84550.00
2021-02-17 83875.00
2021-02-18 82987.50
2021-02-19 82793.75
dtype: float64
import pandas as pd
df = pd.read_excel("DB/ss_ex_1.xlsx" , index_col=0)
df.index = pd.to_datetime(df.index)
df = df.sort_index()
s=df['종가']
df['EWA'] = s.ewm(span=3,adjust=False).mean().shift(1)
cond = df['EWA'] < df['시가']
cond_modified = cond.fillna(False)
수익률 = df.loc[cond_modified,'종가']/df.loc[cond_modified,'시가']
print(수익률.cumprod().iloc[-1]) # 0.8724202765542763
'DataFrame' 카테고리의 다른 글
plot (0) | 2022.04.12 |
---|---|
데이터 샘플링 (0) | 2022.04.09 |
TimeStamp (0) | 2022.04.09 |
판다스 데이터프레임 (데이터 읽기/ 저장하기) (0) | 2022.04.07 |
판다스 데이터프레임 (고급기능2) (0) | 2022.04.07 |
댓글