Submitting Spark Jobs
Submitting a job
With dependency jars
./bin/spark-submit --class sparkJava.SparkOnHive_SqlContext \
--master spark://192.0.0.0:7077 \
--driver-memory 1g --executor-memory 1g --total-executor-cores 1 \
--jars <dependency jars, comma-separated if more than one> \
<application-jar-path>
Submitting to YARN
./bin/spark-submit \
--class net.ctzcdn.processmonitor.monitored.WordContSpark \
--driver-memory 600m \
--executor-cores 1 \
--executor-memory 1g \
--num-executors 1 \
--master yarn \
--deploy-mode cluster myprocess-1.0.jar
Submitting in the background
nohup ./bin/spark-submit --class net.ctzcdn.processmonitor.monitored.WordCountScala \
--driver-memory 600m \
--executor-cores 1 \
--executor-memory 1g \
--num-executors 1 \
--master yarn \
--deploy-mode cluster \
myprocess-1.0.jar >> /home/adminuser/soft/scalaWordCount01.log &
Basic submission
./spark-submit \
--master spark://openstack-030.cdnyf.io:7077 \
--class org.apache.spark.examples.SparkPi \
--executor-memory 1g \
--total-executor-cores 1 \
/home/adminuser/app/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar 100
Starting spark-shell in YARN mode
./bin/spark-shell --master yarn --deploy-mode client
Starting the thriftserver service
1. Configure the HiveServer2 thrift bind address and port
Edit hive-site.xml:
hive.server2.thrift.port=10000
hive.server2.thrift.bind.host=localhost
2. Integrate the Hive environment (as with Spark SQL)
3. Start and stop the service
$ sbin/start-thriftserver.sh
$ sbin/stop-thriftserver.sh
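For reference, the two settings from step 1 are standard Hive properties; in hive-site.xml they take the usual property form (the values shown here simply mirror the ones listed above):

```xml
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>localhost</value>
</property>
```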
Connecting from Java via JDBC
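A minimal sketch of a SparkSqlCli client that connects to the thriftserver configured above through the Hive JDBC driver. The host, port, database (`default`), user, and the table name `src` are all placeholder assumptions; hive-jdbc and its dependencies must be on the classpath for the connection to work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkSqlCli {
    // Build the JDBC URL from the thrift host/port set in hive-site.xml
    static String url(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port + "/default";
    }

    public static void main(String[] args) throws Exception {
        // User/password are placeholders; an unsecured cluster accepts any user
        try (Connection conn =
                 DriverManager.getConnection(url("localhost", 10000), "adminuser", "");
             Statement stmt = conn.createStatement();
             // `src` is a placeholder table name
             ResultSet rs = stmt.executeQuery("SELECT * FROM src LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

The try-with-resources block closes the connection, statement, and result set automatically, even if the query throws.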