Spark Job Submission

Submitting a job

Submitting with dependency jars

./bin/spark-submit --class sparkJava.SparkOnHive_SqlContext \
--master spark://192.0.0.0:7077 \
--driver-memory 1g --executor-memory 1g --total-executor-cores 1 \
--jars <dependency jars, comma-separated> \
jarpath
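`--jars` takes a single comma-separated string, which is tedious to assemble by hand when there are many dependencies. A small sketch that builds the list from a directory (the `./lib` path is hypothetical):

```shell
# Join every jar under ./lib into one comma-separated string for --jars.
JARS=$(ls ./lib/*.jar | paste -sd, -)
echo "$JARS"
```

The resulting value can then be passed directly as `--jars "$JARS"`.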


Submitting to YARN

./bin/spark-submit \
--class net.ctzcdn.processmonitor.monitored.WordContSpark \
--driver-memory 600m \
--executor-cores 1 \
--executor-memory 1g \
--num-executors 1 \
--master yarn \
--deploy-mode cluster myprocess-1.0.jar

Submitting in the background

nohup ./bin/spark-submit --class net.ctzcdn.processmonitor.monitored.WordCountScala \
--driver-memory 600m \
--executor-cores 1 \
--executor-memory 1g \
--num-executors 1 \
--master yarn \
--deploy-mode cluster \
myprocess-1.0.jar >> /home/adminuser/soft/scalaWordCount01.log &
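The `nohup … &` pattern above detaches the submission from the terminal so it survives logout. A minimal stand-alone sketch of the same pattern, with `sleep` standing in for `spark-submit` and `2>&1` added so errors land in the same log:

```shell
# Run a command in the background, immune to hangup, appending output to a log.
nohup sleep 1 >> /tmp/bg-demo.log 2>&1 &
echo "background pid: $!"
wait   # block until the background job finishes
```

Note that with `--deploy-mode cluster` the driver runs on the cluster, so the local process exits once the application is accepted and the log mainly captures submission output.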


A plain submission

./spark-submit \
--master spark://openstack-030.cdnyf.io:7077 \
--class org.apache.spark.examples.SparkPi \
--executor-memory 1g \
--total-executor-cores 1 \
/home/adminuser/app/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar 100

Starting spark-shell in YARN mode

./bin/spark-shell --master yarn --deploy-mode client


Starting the Thrift server

1. Configure the HiveServer2 Thrift bind address and port
    Edit the hive-site.xml file:
    hive.server2.thrift.port=10000
    hive.server2.thrift.bind.host=localhost
2. Integrate the Hive environment (as for Spark SQL)
3. Start / stop the service
    $ sbin/start-thriftserver.sh
    $ sbin/stop-thriftserver.sh
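The two settings from step 1 go into hive-site.xml as standard Hadoop-style properties; a sketch of the fragment, with the values from the steps above:

```xml
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>localhost</value>
</property>
```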

Connecting from Java over JDBC

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkSqlCli {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver (hive-jdbc must be on the classpath)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://1.18.12.13:10000", "adminuser", "");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("show databases")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}