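The whole article in one minute
This post covers moving DataFrame data into and out of MySQL: installing mysqlclient (the Python 3 replacement for MySQLdb), and using pandas' built-in to_sql to load data quickly.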
Installing the package: Python 3 ImportError: No module named 'ConfigParser'
At first I naively assumed pip would take care of it:
sudo pip install MySQL-python
Instead, yet again, up came Python 3 ImportError: No module named 'ConfigParser'. What on earth is that?
As an answer on SO puts it: In Python 3, ConfigParser has been renamed to configparser for PEP 8 compliance. It looks like the package you are installing does not support Python 3.
In other words, MySQL-python does not support Python 3!
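The rename is easy to see for yourself. Below is a minimal sketch, assuming Python 3; the [mysql] section and its keys are made up purely for illustration:

```python
import configparser  # Python 3 name; in Python 2 the module was called ConfigParser

# Hypothetical settings, used only to show the module working under Python 3.
parser = configparser.ConfigParser()
parser.read_string("""
[mysql]
host = localhost
dbname = sys
""")

mysql_host = parser["mysql"]["host"]
print(mysql_host)
```

Under Python 3, `import ConfigParser` raises the ImportError quoted above. A package that still imports the old name will fail in the same way.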
Don't panic, mysqlclient can stand in for it:
brew install mysql
pip install mysqlclient
So, what is mysqlclient?
It is a fork of MySQL-python with added support for Python 3.
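Because mysqlclient is a fork, it still installs under the old module name MySQLdb. A quick way to check whether the install worked is sketched below. The helper name mysqldb_available is my own, not part of any library:

```python
import importlib.util


def mysqldb_available():
    """Return True if a module named MySQLdb can be found in this environment."""
    return importlib.util.find_spec("MySQLdb") is not None


driver_ok = mysqldb_available()
print("MySQLdb importable:", driver_ok)
```

If this prints False after `pip install mysqlclient`, pip probably installed into a different Python than the one you are running.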
If you are not familiar with brew, see my earlier post: Python之美——一只数据狗的笔记[长期更新]
How to load a DataFrame directly into MySQL
from sqlalchemy import create_engine
import pandas as pd
import numpy as np

# Build a small random DataFrame to use as sample data
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])

# Replace username, password, host and dbname with your own values
engine = create_engine('mysql://username:password@host/dbname')
# if_exists='replace' drops and recreates the table if it already exists
frame.to_sql(con=engine, name='table_name_for_df', if_exists='replace')
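Note: mysql.connector can also connect and insert data, but a raw mysql.connector connection does not seem to be accepted by DataFrame.to_sql, which expects a SQLAlchemy engine or connection (or a sqlite3 connection).
Reference: How to insert pandas dataframe via mysqldb into database?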
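One thing that tends to bite with the connection string is a password containing characters such as @ or /. The sketch below builds the URL by hand with the standard library. The helper build_mysql_url and the sample credentials are hypothetical. Naming the driver explicitly (mysql+mysqldb for mysqlclient, mysql+mysqlconnector for mysql.connector) follows SQLAlchemy's dialect+driver URL format. I have not tested whether going through SQLAlchemy this way makes mysql.connector usable with to_sql:

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, dbname, driver="mysqldb"):
    """Build a SQLAlchemy-style MySQL URL, escaping the user name and password."""
    return "mysql+{}://{}:{}@{}/{}".format(
        driver, quote_plus(user), quote_plus(password), host, dbname
    )


# Placeholder credentials; the password deliberately contains '@' and '/'.
url = build_mysql_url("username", "p@ss/word", "host", "dbname")
print(url)
```

The resulting string can be passed to create_engine in place of the hand-written one.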
How to turn query results into a DataFrame
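from sqlalchemy import create_engine, text
import pandas as pd

engine = create_engine('mysql://username:password@host/dbname')
# The with block closes the connection when done
with engine.connect() as connection:
    # text() is required for raw SQL strings in SQLAlchemy 2.x
    resoverall = connection.execute(text("SELECT * FROM sys.table_name_for_df"))
    # Take the column names from the result so they match the query
    df = pd.DataFrame(resoverall.fetchall(), columns=list(resoverall.keys()))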
Reference: How to convert SQL Query result to PANDAS Data Structure?
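pandas can also run the query and build the DataFrame in a single call with read_sql_query. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, so it runs without a server. That substitution and the sample rows are my assumptions. With MySQL you would pass the SQLAlchemy engine as con instead:

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
sample = pd.DataFrame({"feature1": [0.1, 0.2], "class": [0, 1]})
sample.to_sql(name="table_name_for_df", con=conn, index=False)

# read_sql_query runs the query and returns a DataFrame with column names set.
df = pd.read_sql_query("SELECT * FROM table_name_for_df", conn)
print(df)
conn.close()
```

This removes the manual fetchall and column assignment steps.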
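What to do when Alibaba Cloud won't take one large bulk insert
When writing data to an Alibaba Cloud server, large batches kept failing for no obvious reason, roughly once they exceeded 100,000 rows. My guess is that some limit is configured on the server. The fix is simple: if it won't all go in at once, split it up: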
engine = create_engine('mysql://root:password@ip/sys')  # Alibaba Cloud
chunk_rows = 10000  # rows written per to_sql call
for jj in range(0, len(rawdata), chunk_rows):
    print(jj // chunk_rows)  # progress: index of the current chunk
    temraw = rawdata.iloc[jj:jj + chunk_rows, :]
    temraw.to_sql(con=engine, name='GBM_3H_RAW', if_exists='append')
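Here rawdata is the DataFrame you want to load. Note that if_exists has changed from the earlier replace to append, so that each call behaves like an INSERT rather than recreating the table.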
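to_sql also has a chunksize parameter that writes the rows in batches within a single call. I have not checked whether this gets around the Alibaba Cloud limit described above, so treat it as something to try, not a confirmed fix. The sketch uses an in-memory SQLite database and a tiny made-up rawdata so it runs anywhere, and it also shows the replace versus append difference:

```python
import sqlite3

import pandas as pd

rawdata = pd.DataFrame({"value": list(range(25))})  # stand-in for the real DataFrame
conn = sqlite3.connect(":memory:")  # stand-in for the MySQL engine

# chunksize=10 makes to_sql write the 25 rows in batches of at most 10.
rawdata.to_sql(name="GBM_3H_RAW", con=conn, if_exists="replace",
               index=False, chunksize=10)

# A second call with if_exists="append" adds rows instead of recreating the table.
rawdata.to_sql(name="GBM_3H_RAW", con=conn, if_exists="append",
               index=False, chunksize=10)

row_count = conn.execute("SELECT COUNT(*) FROM GBM_3H_RAW").fetchone()[0]
print(row_count)
conn.close()
```

Had the second call used replace, the table would have been recreated and only the second batch of rows would remain.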