I got this error when running rollout.sh: <div class="snippet-clipboard-content no

This isn't an issue with the TPC-DS s but your configuration forH

I see the code says this: <div class="snippet-clipboard-content notranslate positi

Could not allocate enough memory about tpc-ds HOT 8 CLOSED

runningjon commented on July 18, 2024

Could not allocate enough memory

from tpc-ds.

Comments (8)

RunningJon commented on July 18, 2024

This isn't an issue with the TPC-DS scripts but your configuration for HAWQ. You don't have enough memory configured for HAWQ. How many nodes? How much RAM per node? How much swap per node? How much RAM is being used by other processes? Are you using using YARN or the default resource manager? Are you using randomly distributed tables? What are the values for these GUCs? hawq_rm_stmt_vseg_memory hawq_rm_memory_limit_perseg default_hash_table_bucket_number hawq_rm_nvseg_perquery_perseg_limit And your OS settings: vm.overcommit_ratio vm.overcommit

…

On Sun, Jan 8, 2017 at 6:20 PM, dimon777 ***@***.***> wrote: I got this error when running rollout.sh: Started gpfdist on port 9005 executing on hadoop01.home.net ./start_gpfdist.sh 9006 /data/hawq/segmentdd/pivotalguru_6 Started gpfdist on port 9006 psql -v ON_ERROR_STOP=ON -f /home/hawq/TPC-DS/04_load/051.insert.call_center.sql | grep INSERT | awk -F ' ' '{print $3}' psql:/home/hawq/TPC-DS/04_load/051.insert.call_center.sql:1: ERROR: Could not allocate enough memory! bucket number of result hash table and external table should match each other (cdbdatalocality.c:4203) Please advise. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#7>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFObREtHzszu8RV-pxFBLuHKIVC2PMN3ks5rQX1UgaJpZM4Ld2oP> .

from tpc-ds.

dimon777 commented on July 18, 2024

How many nodes? 6 datanodes, 8 nodes total
How much RAM per node? 8GB
How much swap per node? 16GB
How much RAM is being used by other processes? 1.3GB is used by CDH processes on each node. 6.5GB is free.
Are you using using YARN or the default resource manager? For Hawq the default RM is used.
Are you using randomly distributed tables? I am not sure about this, since it is TPC-DS generated tables
What are the values for these GUCs?
hawq_rm_stmt_vseg_memory 128mb
hawq_rm_memory_limit_perseg 4GB
default_hash_table_bucket_number 24
hawq_rm_nvseg_perquery_perseg_limit 6

OS settings:
vm.overcommit_ratio = 50
vm.overcommit = this parameter doesn't exist in CentOS7.1

Thank you.

from tpc-ds.

dimon777 commented on July 18, 2024

I see the code says this:

if ((context.resultRelationHashSegNum < context.externTableForceSegNum
		&& context.externTableForceSegNum != 0)
		|| (context.resultRelationHashSegNum < context.externTableLocationSegNum)) {
	elog(ERROR, "Could not allocate enough memory! "
		"bucket number of result hash table and external table should match each other");
}

What should I adjust to avoid this assertion? I don't know how to translate these context parameters to GUCs (if my case is indeed a misconfiguration).

from tpc-ds.

RunningJon commented on July 18, 2024

When you run tpcds.sh, it creates the tpcds_variables.sh file. One of the variables is "RANDOM_DISTRIBUTION" which is set to false by default. I have this default because this is used for Greenplum more frequently and hash distribution works best in that database. For HAWQ, you'll want to set this variable to "true" so that it creates the tables with random distribution. The default scale is also 3TB which I'm not sure is what you want to use. Change GEN_DATA_SCALE to a value that makes sense for your cluster. CDH isn't supported. Cloudera isn't a member of ODPi probably because they don't want to compete directly with HAWQ. CentOS 7.1 is also not supported so you'll want to use RHEL 6. But, vm.overcommit just needs to be added to /etc/sysctl.conf and it should be set to 2. It should be set to this on all nodes. Hortonworks is a member of ODPi and you can also use the Apache ODPi binaries too. You probably have more than just 1.3GB of memory being used for other processes. You probably have about 4GB allocated for the data node and the node manager plus about 2GB for the OS. In this case, you should set hawq_rm_memory_limit_perseg should be 2. 8GB - 6GB = 2GB default_hash_table_bucket_number should be 6 nvseg per segment * 8 nodes = 48

…

On Sun, Jan 8, 2017 at 8:06 PM, dimon777 ***@***.***> wrote: I see the code says this: if ((context.resultRelationHashSegNum < context.externTableForceSegNum && context.externTableForceSegNum != 0) || (context.resultRelationHashSegNum < context.externTableLocationSegNum)) { elog(ERROR, "Could not allocate enough memory! " "bucket number of result hash table and external table should match each other"); } What should I adjust to avoid this assertion? I don't know how to translate these context parameters to GUCs (if my case is indeed a misconfiguration). — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#7 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFObRMvu5AcPIAnzXSdhVEWZ4oYPjbMlks5rQZY-gaJpZM4Ld2oP> .

from tpc-ds.

dimon777 commented on July 18, 2024

Thank you,

I will try these. Unfortunately I have no control to use CDH or Horton or CentOS version (v7 seems to be supported: https://cwiki.apache.org/confluence/display/HAWQ/Build+and+Install) . In my tests I have not seen any issues with this configuration. My goal is to test HAWQ on the available Hadoop platform and compare it with Impala.

from tpc-ds.

RunningJon commented on July 18, 2024

It won't be a valid test if you are using CDH but you should try to run all 99 of the real TPC-DS queries in Impala. It can't do it. They cherry pick and modify the queries to work in their database.

…

On Mon, Jan 9, 2017 at 8:42 AM, dimon777 ***@***.***> wrote: Thank you, I will try these. Unfortunately I have no control to use CDH or Horton or CentOS version (v7 seems to be supported: https://cwiki.apache.org/ confluence/display/HAWQ/Build+and+Install <http://url>) . In my tests I have not seen any issues with this configuration. My goal is to test HAWQ on the available Hadoop platform and compare it with Impala. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#7 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFObRO2A3fVBySUjWhBNw6SsqUDPnBIbks5rQkc7gaJpZM4Ld2oP> .

from tpc-ds.

dimon777 commented on July 18, 2024

Yes, I am aware of Cloudera queries adjustment for Impala tpc-ds test. I also read Pivotal article on this. And I fully agree: the way Cloudera did this is unacceptable and misleading for those who rely on TPC-DS benchmark to make the judgment about the platform.

from tpc-ds.

dimon777 commented on July 18, 2024

The issue was in my environment .bashrc didn't have GREENPLUM_PATH variable set. After adding this:
export GREENPLUM_PATH=/usr/local/hawq/greenplum_path.sh
and changing RANDOM_DISTRIBUTION flag to true in ./rollout.sh call test started working fine.

from tpc-ds.

Could not allocate enough memory about tpc-ds HOT 8 CLOSED

Comments (8)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent