
HIVE-29419: Provide a Hive-specific docker image for Tez AM #6435

Open
abstractdog wants to merge 1 commit into apache:master from abstractdog:HIVE-29419-tez-am-image


@abstractdog abstractdog commented Apr 15, 2026

What changes were proposed in this pull request?

Make the Hive image able to start a Tez AM in LLAP mode that can assign tasks to the LLAP daemons.

Why are the changes needed?

Because it's the next step toward a fully distributed, Dockerized environment for Hive.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually, steps are below.

Needs Hive 4.3.0-SNAPSHOT jars that contain recent changes (specifically HIVE-29477):

# assuming that you're standing on hive master or rather the code of this PR:
mvn clean install -DskipTests -Pdist # this will take a long time, sorry, we need a snapshot jar
cp packaging/target/apache-hive-4.3.0-SNAPSHOT-bin.tar.gz packaging/cache

start cluster:

export HIVE_VERSION=4.3.0-SNAPSHOT
export POSTGRES_LOCAL_PATH=...your_postgres_driver_path...

# hive: 4.3.0-SNAPSHOT is mandatory
# tez: 1.0.0-SNAPSHOT is mandatory to override released tez jars with the currently unreleased 1.0.0 tez jars that can talk to unmanaged tez sessions
./build.sh -hive 4.3.0-SNAPSHOT -hadoop 3.4.1 -tez 0.10.5 -tez-snapshot 1.0.0-SNAPSHOT

./start-hive.sh --llap
docker compose --profile llap logs -f
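The steps above depend on two environment variables being set before the build. A minimal sketch of a preflight check follows; the `preflight` function is hypothetical and not part of this PR, and it assumes that `POSTGRES_LOCAL_PATH` should point to an existing driver file.

```shell
# Hypothetical helper, not part of the PR: validates the environment
# variables used in the steps above before build.sh is invoked.
preflight() {
  if [ -z "${HIVE_VERSION:-}" ]; then
    echo "HIVE_VERSION is not set" >&2
    return 1
  fi
  if [ -z "${POSTGRES_LOCAL_PATH:-}" ]; then
    echo "POSTGRES_LOCAL_PATH is not set" >&2
    return 1
  fi
  # Assumption: the variable must name an existing file (the JDBC driver jar).
  if [ ! -f "$POSTGRES_LOCAL_PATH" ]; then
    echo "POSTGRES_LOCAL_PATH does not point to a file: $POSTGRES_LOCAL_PATH" >&2
    return 1
  fi
  echo "ok: HIVE_VERSION=$HIVE_VERSION"
}
```

It could be called as `preflight && ./build.sh ...` so that the long build is not started with a missing driver path.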

test:

beeline -u 'jdbc:hive2://localhost:10000/' -e "DROP table IF EXISTS iceberg_table; CREATE TABLE iceberg_table (id BIGINT) STORED BY iceberg; INSERT INTO iceberg_table VALUES(1);"

check the logs to confirm that queries go through tezam and the LLAP daemons:

tezam         | 2026-04-15T16:08:05,906 INFO  DAGAppMaster - Starting DAG submitted via RPC: INSERT INTO iceberg_table VALUES(1) (Stage-1)

...

llapdaemon-1  | 2026-04-15T16:08:06,368  INFO [Task-Executor-0 (1776269274583_0000_1_00_000000_0)] impl.LlapTaskReporter: Registered counters for fragment: 1776269274583_0000_1_00_000000_0 vertexName: Map 1

...

hiveserver2   | 2026-04-15T16:08:07,267  INFO [HiveServer2-Background-Pool: Thread-117] SessionState: Status: DAG finished successfully in 1.23 seconds
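The manual log inspection above could be scripted roughly as follows. This is only a sketch: `check_llap_path` is a hypothetical helper, and it assumes the log prefixes (`tezam`, `llapdaemon-…`, `hiveserver2`) and message texts look exactly like the excerpts in this description.

```shell
# Hypothetical helper: reads compose log text on stdin and succeeds only if
# the three markers quoted in this description are all present.
check_llap_path() (
  logs=$(cat)
  # Tez AM received the DAG over RPC.
  printf '%s\n' "$logs" | grep -Eq '^tezam[^|]*\|.*DAGAppMaster - Starting DAG submitted via RPC' \
    || { echo "missing: tezam DAG submission" >&2; exit 1; }
  # An LLAP daemon executed at least one fragment.
  printf '%s\n' "$logs" | grep -Eq '^llapdaemon[^|]*\|.*LlapTaskReporter: Registered counters for fragment' \
    || { echo "missing: llapdaemon fragment execution" >&2; exit 1; }
  # HiveServer2 reported the DAG as finished.
  printf '%s\n' "$logs" | grep -Eq '^hiveserver2[^|]*\|.*DAG finished successfully' \
    || { echo "missing: hiveserver2 DAG success" >&2; exit 1; }
  echo "ok: query went through tezam and llapdaemon"
)
```

One possible invocation, after the beeline test has finished, is `docker compose --profile llap logs --no-color | check_llap_path`.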

A very important test case is that the container layout implemented here in docker-compose stays compatible with the existing, working hs2+llapdaemon setup (no tezam). This was confirmed as follows:

./stop-hive.sh --cleanup

export POSTGRES_LOCAL_PATH=~/.m2/repository/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar
./build.sh -hive 4.2.0 -hadoop 3.4.1 -tez 0.10.5
./start-hive.sh --llap
docker compose --profile llap logs -f

beeline -u 'jdbc:hive2://localhost:10000/' -e "DROP table IF EXISTS iceberg_table; CREATE TABLE iceberg_table (id BIGINT) STORED BY iceberg; INSERT INTO iceberg_table VALUES(1);"

In this case tezam simply fails to start, and the cluster works exactly the same way as it did after HIVE-29411.
Be aware of the difference: with

./build.sh -hive 4.2.0 -hadoop 3.4.1 -tez 0.10.5

the tez zookeeper-based registry and the external sessions code are simply not present, which is why this setup works regardless of the config
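The distinction between the two build invocations in this description can be summarized in a small sketch. The `expected_mode` function and its output labels are hypothetical, and the rule it encodes (the presence of `-tez-snapshot` decides whether tezam is expected to come up) is a simplification that ignores the Hive version requirement.

```shell
# Hypothetical helper: given the arguments passed to build.sh, print which
# execution path this description leads one to expect.
# Simplifying assumption: only the -tez-snapshot flag is inspected.
expected_mode() {
  for arg in "$@"; do
    if [ "$arg" = "-tez-snapshot" ]; then
      # Snapshot Tez jars are present, so tezam should start and take queries.
      echo "tezam"
      return 0
    fi
  done
  # Released Tez jars only: tezam fails to start, hs2 talks to LLAP as before.
  echo "hs2-llap-only"
}
```

For example, `expected_mode -hive 4.2.0 -hadoop 3.4.1 -tez 0.10.5` prints `hs2-llap-only`.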
