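1. Problem
When JSON-formatted data is read and printed in Jupyter Notebook, the output is sometimes misaligned and hard to read. The third-party library tabulate can render the data as a table and produce cleaner output.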
2. Data
- The test data has the following format:
# Test data
courses = [
    {"study": 3450, "duration": 4680, "name": "课程1", "id": 8164, "visit": 1597, "createDate": "2018-01-22", "comments": 11, "credits": 4},
    {"study": 160, "duration": 2467, "name": "课程2", "id": 18057, "visit": 603, "createDate": "2020-06-12", "comments": 14, "credits": 7.5},
    {"study": 1708, "duration": 0, "name": "课程3", "id": 11555, "visit": 7779, "createDate": "2018-11-30", "comments": 1, "credits": 0},
    {"study": 435, "duration": 6713, "name": "课程4", "id": 11793, "visit": 3147, "createDate": "2019-01-19", "comments": 0, "credits": 0},
    {"study": 0, "duration": 12462, "name": "课程5", "id": 12104, "visit": 0, "createDate": "2019-04-08", "comments": 0, "credits": 0},
    {"study": 1, "duration": 468, "name": "课程6", "id": 12592, "visit": 4, "createDate": "2019-06-05", "comments": 0, "credits": 0},
    {"study": 3514, "duration": 23505, "name": "课程7", "id": 14459, "visit": 104005, "createDate": "2020-03-30", "comments": 22, "credits": 14},
    {"study": 145, "duration": 3715, "name": "课程8", "id": 16618, "visit": 908, "createDate": "2020-05-22", "comments": 0, "credits": 0},
    {"study": 135, "duration": 3741, "name": "课程9", "id": 16621, "visit": 813, "createDate": "2020-05-22", "comments": 4, "credits": 0},
    {"study": 257, "duration": 3891, "name": "课程10", "id": 17588, "visit": 1134, "createDate": "2020-06-04", "comments": 8, "credits": 3.5},
    {"study": 5005, "duration": 2175, "name": "课程11", "id": 11205, "visit": 29907, "createDate": "2018-09-26", "comments": 24, "credits": 4},
]
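The list above is already a Python object. When the same kind of records arrives as JSON text instead, a minimal sketch of loading it could look like the following. The `raw_json` string is a made-up two-record sample used only for illustration, not the original data source.

```python
import json

import pandas as pd

# Hypothetical JSON text with a few of the same fields as the test data
raw_json = (
    '[{"name": "课程1", "study": 3450, "visit": 1597},'
    ' {"name": "课程2", "study": 160, "visit": 603}]'
)

# json.loads turns the JSON array into a list of dicts
records = json.loads(raw_json)

# Each dict becomes one row; the dict keys become the column names
sample_df = pd.DataFrame(records)
print(sample_df.shape)
```

From here on, `sample_df` can be handled the same way as a DataFrame built from `courses`.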
3. Reading the data directly
import pandas as pd
df = pd.DataFrame(courses)
# Print the first five rows
print(df.head())
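- The output looks like this:
The column headers do not line up with the values beneath them, and the longer the values are, the messier the output becomes.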
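One likely contributor to the misalignment is that CJK characters such as those in the `name` column occupy two display cells, while pandas by default counts each of them as one. As a side note to the tabulate approach, pandas has a display option, `display.unicode.east_asian_width`, that accounts for this. The sketch below uses a small stand-in sample rather than the full test data, and assumes the misalignment comes from wide characters.

```python
import pandas as pd

# Small stand-in sample containing wide (CJK) characters
sample = [
    {"name": "课程1", "study": 3450, "visit": 1597},
    {"name": "课程10", "study": 257, "visit": 1134},
]
sample_df = pd.DataFrame(sample)

# Temporarily make pandas measure East Asian characters as two cells wide
with pd.option_context("display.unicode.east_asian_width", True):
    aligned = sample_df.to_string()

print(aligned)
```

This option makes rendering slower, and it only fixes the padding; it does not draw borders or wrap long values the way a table formatter does.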
4. Solution
Use the tabulate library to prettify the output.
- Install the tabulate library
pip install tabulate
- Import the libraries
from tabulate import tabulate
import pandas as pd
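- Read the data again
# Read the data
df = pd.DataFrame(courses)
# 'days_online' is not in the raw data. Assumption for illustration only:
# take it as the number of days from createDate until today.
df['days_online'] = (pd.Timestamp.today().normalize() - pd.to_datetime(df['createDate'])).dt.days
# Select the columns to show and convert the first five rows to a list of lists
data_subset = df[['name', 'study', 'visit', 'days_online', 'duration', 'comments', 'credits']].head().values.tolist()
# Replace the English field names with Chinese headers
# (course name, learners, visits, days online, duration in seconds, comments, credits)
headers = ['课程名称', '学习人数', '学习人次', '上架天数(天)', '时长(秒)', '评论数', '学分']
# Print the rows as a bordered grid, wrapping any cell wider than 40 characters
print(tabulate(data_subset, headers=headers, tablefmt='grid', maxcolwidths=40))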
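- The output looks like this:
(Note: the days-online value is not part of the raw data and has to be computed separately. That step is not covered in detail here; the point is the formatting of the output.)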