How to Install and Configure Presto

Hi, subscribers. Today we will show you how to install and configure Presto.

Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.

Presto was designed and written from the ground up for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook.

Prerequisites 

* A server with Java installed. If Java is not installed, install it before continuing.

Installation

Step 1: Download the Presto server tarball (version 0.204 in this guide) from the following link:

https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.204/presto-server-0.204.tar.gz

cd /opt/

wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.204/presto-server-0.204.tar.gz
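Before extracting, it can be worth checking that the download is intact. The following is a minimal sketch, assuming the sha1sum tool is available and that you have a reference checksum for the tarball; the function name verify_sha1 is made up for this example.

```shell
# Hypothetical helper: compare a file's SHA-1 digest with an expected value.
# Returns 0 when they match, non-zero otherwise.
verify_sha1() {
  file="$1"
  expected="$2"
  # sha1sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha1sum "$file" | cut -d' ' -f1)"
  [ "$actual" = "$expected" ]
}
```

Maven Central normally publishes a .sha1 file next to each artifact, so the expected value can usually be read from the same URL with .sha1 appended. For example: verify_sha1 presto-server-0.204.tar.gz "<expected digest>" && echo "checksum OK".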

Step 2: Extract the downloaded file

tar -xvzf presto-server-0.204.tar.gz

mv presto-server-0.204 presto

cd presto

Step 3: Create a data directory. Presto needs a data directory for storing logs and other data. We recommend creating it outside of the installation directory, so that it is easily preserved when upgrading Presto.
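The original steps do not show a command for creating the data directory, so here is a small sketch. The helper name ensure_data_dir is made up for this example; it creates the directory if needed and checks that the current user can write to it.

```shell
# Hypothetical helper: create the data directory (and any missing parents)
# and confirm that the current user can write to it.
ensure_data_dir() {
  dir="$1"
  mkdir -p "$dir" && [ -w "$dir" ]
}
```

For the path used later in this guide, call ensure_data_dir /var/presto/data as a user that is allowed to create directories under /var.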

Configuring Presto

Create an etc directory inside the installation directory. This will hold the following configuration:

* Node Properties: environmental configuration specific to each node
* JVM Config: command line options for the Java Virtual Machine
* Config Properties: configuration for the Presto server
* Catalog Properties: configuration for Connectors (data sources)

mkdir etc

Node Properties

touch etc/node.properties

The node properties file, etc/node.properties, contains configuration specific to each node. A node is a single installed instance of Presto on a machine. This file is typically created by the deployment system when Presto is first installed. The following is a minimal etc/node.properties:

node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/var/presto/data

node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.
node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.
node.data-dir: The location (filesystem path) of the data directory. Presto will store logs and other data here.
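Since node.id must be unique per node and stay the same across restarts, one option is to generate it once when the file is first created. This is a sketch, not part of the official instructions: it assumes you run it from the installation directory (or set the PRESTO_HOME variable, a name chosen for this example), and that either uuidgen or the Linux /proc UUID source is available.

```shell
# Assumption: PRESTO_HOME points at the installation directory (e.g. /opt/presto).
PRESTO_HOME="${PRESTO_HOME:-$PWD}"
NODE_PROPS="$PRESTO_HOME/etc/node.properties"

mkdir -p "$PRESTO_HOME/etc"

# Only write the file if it is missing or empty, so an existing node.id
# is never replaced on a later run.
if [ ! -s "$NODE_PROPS" ]; then
  if command -v uuidgen >/dev/null 2>&1; then
    NODE_ID="$(uuidgen)"
  else
    # Linux fallback when uuidgen is not installed.
    NODE_ID="$(cat /proc/sys/kernel/random/uuid)"
  fi
  cat > "$NODE_PROPS" <<EOF
node.environment=production
node.id=$NODE_ID
node.data-dir=/var/presto/data
EOF
fi
```

Running the snippet a second time leaves the file untouched, which keeps the identifier consistent across reboots and upgrades.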

JVM Config

touch etc/jvm.config

The JVM config file, etc/jvm.config, contains a list of command line options used for launching the Java Virtual Machine. The format of the file is a list of options, one per line. These options are not interpreted by the shell, so options containing spaces or other special characters should not be quoted.

The following provides a good starting point for creating etc/jvm.config:

-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

Because an OutOfMemoryError will typically leave the JVM in an inconsistent state, we write a heap dump (for debugging) and forcibly terminate the process when this occurs.
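Because the options in this file are not interpreted by the shell, a stray quote character ends up as part of the option itself. The following is a small sketch of a check for that mistake; the function name lint_jvm_config is made up for this example.

```shell
# Hypothetical helper: fail if any line of the given jvm.config contains
# a single or double quote. Offending lines are printed with line numbers.
lint_jvm_config() {
  if grep -n "[\"']" "$1" >&2; then
    echo "$1: remove the quotes from the options listed above" >&2
    return 1
  fi
  return 0
}
```

Usage: lint_jvm_config etc/jvm.config && echo "jvm.config looks fine".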

Config Properties
The config properties file, etc/config.properties, contains the configuration for the Presto server. Every Presto server can function as both a coordinator and a worker, but dedicating a single machine to only perform coordination work provides the best performance on larger clusters.

touch etc/config.properties

The following is a minimal configuration for the coordinator:

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080

And this is a minimal configuration for the workers:

coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=http://example.net:8080
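The coordinator and worker configurations above differ only in a few lines, so they can be generated from one place. The following is a sketch under that assumption; the function name write_presto_config and its arguments are made up for this example, and the memory values are simply the ones shown above.

```shell
# Hypothetical helper: write a minimal config.properties for one role.
#   $1 = role ("coordinator" or "worker")
#   $2 = discovery URI of the coordinator
#   $3 = output file
write_presto_config() {
  role="$1"
  uri="$2"
  out="$3"
  case "$role" in
    coordinator|worker) ;;
    *) echo "role must be coordinator or worker" >&2; return 1 ;;
  esac
  {
    if [ "$role" = coordinator ]; then
      echo "coordinator=true"
      echo "node-scheduler.include-coordinator=false"
      echo "discovery-server.enabled=true"
    else
      echo "coordinator=false"
    fi
    # Settings shared by both roles.
    echo "http-server.http.port=8080"
    echo "query.max-memory=50GB"
    echo "query.max-memory-per-node=1GB"
    echo "discovery.uri=$uri"
  } > "$out"
}
```

Usage: write_presto_config worker http://example.net:8080 etc/config.properties. Every node must point discovery.uri at the same coordinator address.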

Alternatively, if you are setting up a single machine for testing that will function as both a coordinator and worker, use this configuration:

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://example.net:8080

Log Levels
The optional log levels file, etc/log.properties, allows setting the minimum log level for named logger hierarchies. Every logger has a name, which is typically the fully qualified name of the class that uses the logger. Loggers have a hierarchy based on the dots in the name (like Java packages). For example, consider the following log levels file:

com.facebook.presto=INFO

This would set the minimum level to INFO for both com.facebook.presto.server and com.facebook.presto.hive. The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.

Catalog Properties
Presto accesses data via connectors, which are mounted in catalogs. The connector provides all of the schemas and tables inside of the catalog. For example, the Hive connector maps each Hive database to a schema, so if the Hive connector is mounted as the hive catalog, and Hive contains a table clicks in database web, that table would be accessed in Presto as hive.web.clicks.

Catalogs are registered by creating a catalog properties file in the etc/catalog directory. For example, create etc/catalog/jmx.properties with the following contents to mount the jmx connector as the jmx catalog:

connector.name=jmx
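As a sketch, the jmx catalog file can be created from the shell like this. It assumes you run it from the installation directory, or set the PRESTO_HOME variable (a name chosen for this example) to point at it.

```shell
# Assumption: PRESTO_HOME points at the installation directory (e.g. /opt/presto).
PRESTO_HOME="${PRESTO_HOME:-$PWD}"

# Catalog files live in etc/catalog; the file name (jmx) becomes the catalog name.
mkdir -p "$PRESTO_HOME/etc/catalog"
echo "connector.name=jmx" > "$PRESTO_HOME/etc/catalog/jmx.properties"
```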

For more connectors, see the Connectors section of the Presto documentation.

Running Presto

The installation directory contains the launcher script in bin/launcher. Presto can be started as a daemon by running the following:

bin/launcher start

Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):

bin/launcher run

Now open http://<IPaddress>:8080 in a browser to check that the Presto web interface is reachable.
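The same check can be scripted. The following is a sketch that polls the server over HTTP until it answers; it assumes curl is installed and uses the /v1/info endpoint of the Presto HTTP server. The function name wait_for_presto and its arguments are made up for this example.

```shell
# Hypothetical helper: poll a Presto server until it responds.
#   $1 = base URL, e.g. http://localhost:8080
#   $2 = maximum number of attempts (default 30, one per second)
wait_for_presto() {
  url="$1"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -fsS --max-time 2 "$url/v1/info" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Usage: wait_for_presto http://localhost:8080 60 && echo "Presto is up". When Presto was started as a daemon, bin/launcher status also reports whether the process is running.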

Reference: https://prestodb.io/docs/current/installation/deployment.html

© 2018, Techrunnr. All rights reserved.

Prabhin Prabharkaran

He is Technical professional. He is a person who loves to share tricks and tips on the Internet. He Posts what he does!!
