1. Given the string "hello_world_spark", how do you split it on "_"?
s = "hello_world_spark"
print(s.split("_"))   # avoid naming the variable str, which shadows the built-in type
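For reference, split("_") returns a list of the pieces; a quick check of the result and of the optional maxsplit argument (outputs shown as comments):
s = "hello_world_spark"
parts = s.split("_")           # ['hello', 'world', 'spark']
first, rest = s.split("_", 1)  # limit to one split: 'hello' and 'world_spark'
print(parts, first, rest)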
2. Print the 9×9 multiplication table in Python.
for i in range(1, 10):
    for j in range(1, i + 1):
        print(f"{i}×{j}={i*j}", end=" ")
    print()
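For comparison, the same table can be produced as a single expression with str.join and a nested generator (a minimal sketch, essentially the same output as the loop above):
print("\n".join(" ".join(f"{i}×{j}={i*j}" for j in range(1, i + 1)) for i in range(1, 10)))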
3. Write a function test that finds the position of the word "welcome" in the string "Hello, welcome to my world.", returning -1 if it is not found.
def test(s):
    word = "welcome"
    return s.find(word)   # str.find returns -1 when the substring is absent

print(test("Hello, welcome to my world."))
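A variant of the same lookup using str.index, which raises ValueError instead of returning -1 (a sketch; the helper name test_index is illustrative only):
def test_index(s, word="welcome"):
    try:
        return s.index(word)
    except ValueError:
        return -1

print(test_index("Hello, welcome to my world."))   # 7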
4. Find the narcissistic numbers (3-digit numbers) below 1000.
num = 0
for i in range(100, 1000):   # 3-digit numbers only
    num_str = str(i)
    g = int(num_str[2])      # ones digit
    s = int(num_str[1])      # tens digit
    b = int(num_str[0])      # hundreds digit
    if g**3 + s**3 + b**3 == i:
        print(i)
        num += 1
print("Total:", num)
5. Find the longest word in the list a = ["hello", "world", "spark", "congratulations"].
a = ["hello", "world", "spark", "congratulations"]
max_word = max(a, key=len)
print(max_word)
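Note that max(a, key=len) returns only the first word of maximal length when there are ties; a short sketch that keeps every word of that length:
a = ["hello", "world", "spark", "congratulations"]
longest = max(len(w) for w in a)
print([w for w in a if len(w) == longest])   # ['congratulations']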
6. Test a first Jupyter + Spark program.
Install the required packages:
pip install findspark -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pypandoc -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pyspark --trusted-host mirrors.aliyun.com
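After the install finishes, a quick sanity check that pyspark is importable (the printed version will depend on your environment):
import pyspark
print(pyspark.__version__)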
import findspark
findspark.init()   # locate the Spark installation before pyspark is imported
from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local").setAppName("My App")
sc = SparkContext(conf=conf)
logFile = "E:\\1.txt"
logData = sc.textFile(logFile, 2).cache()   # read the file into an RDD with 2 partitions and cache it in memory
numAs = logData.filter(lambda line: 'a' in line).count()   # count lines containing 'a'
numBs = logData.filter(lambda line: 'b' in line).count()   # count lines containing 'b'
print('Lines with a: %s, Lines with b: %s' % (numAs, numBs))
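As a follow-up exercise, the same RDD lends itself to a classic word count using the standard flatMap / map / reduceByKey operators; a minimal sketch assuming sc and logData from above are still active:
counts = (logData.flatMap(lambda line: line.split())   # split each line into words
                 .map(lambda word: (word, 1))          # pair each word with a count of 1
                 .reduceByKey(lambda a, b: a + b))     # sum the counts per word
print(counts.take(10))                                 # preview the first 10 (word, count) pairs
sc.stop()                                              # release the SparkContext when done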