Running SparkContext raises an error:
ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at /usr/local/spark/python/pyspark/shell.py:59
hadoop@rachel-virtual-machine:/usr/local/spark/bin$ ./pyspark
./pyspark: line 45: python: command not found
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
2019-08-28 15:27:12 WARN Utils:66 - Your hostname, rachel-virtual-machine resolves to a loopback address: 127.0.1.1; using 192.168.80.128 instead (on interface ens33)
2019-08-28 15:27:12 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2019-08-28 15:27:23 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.3
      /_/
Using Python version 3.6.8 (default, Jan 14 2019 11:02:34)
SparkSession available as 'spark'.
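Note the "line 45: python: command not found" message during startup: the bin/pyspark launcher looks for a `python` executable, which many distributions no longer provide when only Python 3 is installed. The shell evidently fell back to Python 3.6.8 here, so it is only a warning, but a common way to silence it (an assumption about this particular setup, not something shown in the session) is to set PYSPARK_PYTHON=python3 in the environment or in conf/spark-env.sh.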
>>> from pyspark import SparkContext
>>> sc = SparkContext( 'local', 'test')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/spark/python/pyspark/context.py", line 129, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/usr/local/spark/python/pyspark/context.py", line 328, in _ensure_initialized
callsite.function, callsite.file, callsite.linenum))
ValueError: Cannot run multiple SparkContexts at once; existing SparkContext(app=PySparkShell, master=local[*]) created by <module> at /usr/local/spark/python/pyspark/shell.py:59
This error occurs because the pyspark shell already created a SparkContext at startup (the `sc` behind the "SparkSession available as 'spark'" message), and Spark allows only one active context per process. The existing context must be stopped before a new one can be created:
>>> sc.stop()
>>> sc=SparkContext("local","test")
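As an alternative to the stop/restart just shown, SparkContext.getOrCreate() (part of the PySpark API since Spark 1.4) returns the already-running context instead of raising ValueError, and only creates a fresh one when none exists. A minimal sketch:

>>> from pyspark import SparkContext
>>> sc = SparkContext.getOrCreate()  # reuses the shell's existing context instead of raising

With a working context in hand, the session below runs a quick sanity check that counts lines in Spark's bundled README.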
>>> logFile = "file:///usr/local/spark/README.md"
>>> logData = sc.textFile(logFile, 2).cache()
>>> numAs = logData.filter(lambda line: 'a' in line).count()
>>> numBs = logData.filter(lambda line: 'b' in line).count()
>>> print('Lines with a: %s, Lines with b: %s' % (numAs, numBs))
Lines with a: 61, Lines with b: 30
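The same job can also be run outside the interactive shell, which sidesteps the duplicate-context problem entirely because a standalone script creates and owns its own SparkContext. A minimal sketch (the file name count_ab.py is hypothetical; the Spark install path matches the one used above):

# count_ab.py -- run with: /usr/local/spark/bin/spark-submit count_ab.py
from pyspark import SparkContext

if __name__ == "__main__":
    # The script owns this context; no shell-created one exists to conflict with.
    sc = SparkContext("local", "test")
    logData = sc.textFile("file:///usr/local/spark/README.md").cache()
    numAs = logData.filter(lambda line: 'a' in line).count()
    numBs = logData.filter(lambda line: 'b' in line).count()
    print('Lines with a: %s, Lines with b: %s' % (numAs, numBs))
    sc.stop()  # release the context so later runs can create their own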