Are there any MySQL benchmarking tools? [closed]

Question

jcho360

Asked: 2012-05-04 12:51:54 +0800 CST

Which is faster, InnoDB or MyISAM?

How can MyISAM be "faster" than InnoDB if

MyISAM needs to do disk reads for the data?
InnoDB uses the buffer pool for both indexes and data, while MyISAM caches only the indexes?

4 answers

RolandoMySQLDBA · Answer 1 · 2012-05-04T13:32:47+08:00

The only way MyISAM can be faster than InnoDB is under this unique circumstance.

MyISAM

When read, a MyISAM table's indexes can be read once from the .MYI file and loaded into the MyISAM Key Cache (as sized by key_buffer_size). How can you make a MyISAM table's .MYD faster to read? With this:

ALTER TABLE mytable ROW_FORMAT=Fixed;
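
A rough way to see why a fixed row format can help reads: when every row has the same length, the byte offset of row n is simply n times the row length, so a row can be located without any per-row length bookkeeping. The sketch below illustrates that idea in Python with a made-up record layout. The names `ROW`, `pack_rows` and `read_row` are hypothetical, and this is not MyISAM's actual .MYD format.

```python
import struct

# Hypothetical fixed-width layout: an 8-byte integer id plus a 16-byte name field.
ROW = struct.Struct("<q16s")

def pack_rows(rows):
    """Serialize (id, name) pairs into one contiguous buffer of fixed-width rows."""
    return b"".join(ROW.pack(i, name.encode("ascii")) for i, name in rows)

def read_row(buf, n):
    """Fetch row n directly: its offset is n * ROW.size, no scanning required."""
    row_id, raw = ROW.unpack_from(buf, n * ROW.size)
    return row_id, raw.rstrip(b"\x00").decode("ascii")

data = pack_rows([(1, "alice"), (2, "bob"), (3, "carol")])
print(ROW.size, read_row(data, 2))
```

The trade-off is visible in the layout itself: short names are padded out to the full 16 bytes, so the table takes more space in exchange for simpler row addressing.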

I wrote about this in my past posts

Best of MyISAM and InnoDB (please read this one first)
What is the performance impact of using CHAR vs VARCHAR on a fixed-size field? (TRADEOFF #2)
Optimized my.cnf for high-end and busy server (under the heading Replication)
Which DBMS is good for super-fast reads and a simple data structure? (Paragraph 3)

InnoDB

OK, what about InnoDB? Does InnoDB do any disk I/O for queries? Surprisingly, yes it does!! You are probably thinking I am crazy for saying that, but it is absolutely true, even for SELECT queries. At this point, you are probably wondering "How in the world is InnoDB doing disk I/O for queries?"

It all goes back to InnoDB being an ACID-compliant transactional storage engine. For InnoDB to be transactional, it has to support the I in ACID, which is Isolation. Transaction isolation is maintained via MVCC, Multiversion Concurrency Control. In simple terms, InnoDB records what the data looks like before transactions attempt to change it. Where does that get recorded? In the system tablespace file, better known as ibdata1. That requires disk I/O.
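
To make the "record what the data looked like" idea concrete, here is a toy sketch in Python, assuming a single-threaded in-memory store. `VersionedStore` and its methods are made-up names for illustration; InnoDB's real undo logs and read views are far more involved than this.

```python
class VersionedStore:
    """Toy multiversion store: writers save the old value in an undo log."""

    def __init__(self):
        self.current = {}    # key -> (value, version that wrote it)
        self.undo_log = []   # (key, old_value, old_version), newest last
        self.version = 0

    def write(self, key, value):
        # Record the before-image so older snapshots can still be served.
        if key in self.current:
            old_value, old_version = self.current[key]
            self.undo_log.append((key, old_value, old_version))
        self.version += 1
        self.current[key] = (value, self.version)

    def snapshot(self):
        """Return the version a reader starting now should see."""
        return self.version

    def read(self, key, snapshot):
        value, version = self.current[key]
        if version <= snapshot:
            return value
        # Too new for this reader: walk the undo log backwards for an older image.
        for k, old_value, old_version in reversed(self.undo_log):
            if k == key and old_version <= snapshot:
                return old_value
        raise KeyError(key)


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()          # a reader starts here
store.write("balance", 250)      # a later writer changes the row
print(store.read("balance", snap), store.read("balance", store.snapshot()))
```

The reader that started before the second write still sees 100, which is only possible because the write first saved the old value. In this toy that save is a list append; in a real engine it is extra work that eventually reaches disk.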

COMPARISON

Since both InnoDB and MyISAM do disk I/O, what random factors dictate which is faster?

Size of columns
Column format
Character sets
Range of numeric values (requiring large enough INTs)
Rows being split across blocks (row chaining)
Data fragmentation caused by DELETEs and UPDATEs
Size of the primary key (InnoDB has a clustered index, so a lookup through a secondary index requires two key lookups)
Size of index entries
The list goes on...

Thus, in a read-heavy environment, it is possible for a MyISAM table with a fixed row format to outperform InnoDB reads from the InnoDB Buffer Pool if enough data is being written to the undo logs contained in ibdata1 to support the transactional behavior imposed on the InnoDB data.
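
The "two key lookups" point in the list above can be pictured with plain dictionaries. This is a conceptual sketch only, with made-up data and names (`clustered`, `by_email`, `find_by_email`); real indexes are B-trees, not hash maps.

```python
# Clustered index: primary key -> full row (the row data lives in the index).
clustered = {
    101: {"id": 101, "email": "a@example.com", "name": "Ann"},
    102: {"id": 102, "email": "b@example.com", "name": "Bob"},
}

# Secondary index: indexed column value -> primary key (not the row itself).
by_email = {row["email"]: pk for pk, row in clustered.items()}

def find_by_email(email):
    """Two lookups: the secondary index gives the PK, the clustered index gives the row."""
    pk = by_email[email]     # lookup 1
    return clustered[pk]     # lookup 2

print(find_by_email("b@example.com")["name"])
```

Because every secondary index entry carries a copy of the primary key, a wide primary key makes every secondary index larger, which is why primary key size appears in the list.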

CONCLUSION

Plan your data types, queries, and storage engine very carefully. Once the data grows, it can become very difficult to move it around. Just ask Facebook...

Mike Peters · Answer 2 · 2012-05-18T21:51:31+08:00

In a simple world, MyISAM is faster for reads and InnoDB is faster for writes.

Once you start introducing mixed reads and writes, InnoDB will be faster for reads as well, thanks to its row-locking mechanism.

I wrote a comparison of MySQL storage engines a few years ago that still holds true to this day, outlining the unique differences between MyISAM and InnoDB.

In my experience, you should use InnoDB for everything except read-heavy cache tables, where losing data due to corruption is not as critical.

StackG · Answer 3 · 2015-06-12T08:30:00+08:00

To add to the responses here covering the mechanical differences between the two engines, I present an empirical speed comparison study.

In terms of pure speed, MyISAM is not always faster than InnoDB, but in my experience it tends to be faster for PURE READ workloads by a factor of about 2.0-2.5. Clearly this isn't appropriate for all environments - as others have written, MyISAM lacks such things as transactions and foreign keys.

I've done a bit of benchmarking below - I used Python for looping and the timeit library for timing comparisons. For interest I've also included the memory engine, which gives the best performance across the board, although it is only suitable for smaller tables (you continually encounter The table 'tbl' is full when you exceed the MySQL memory limit). The four types of select I look at are:

vanilla SELECTs
counts
conditional SELECTs
indexed and non-indexed sub-selects

First, I created three tables using the following SQL

CREATE TABLE
    data_interrogation.test_table_myisam
    (
        index_col BIGINT NOT NULL AUTO_INCREMENT,
        value1 DOUBLE,
        value2 DOUBLE,
        value3 DOUBLE,
        value4 DOUBLE,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8

with 'MyISAM' replaced by 'InnoDB' and 'memory' in the second and third tables.

1) Vanilla selects

Query: SELECT * FROM tbl WHERE index_col = xx

Result: draw

[Figure: Comparison of vanilla selects by different database engines]

The speed of these is all broadly the same and, as expected, is linear in the number of rows selected. InnoDB seems slightly faster than MyISAM, but this is really marginal.

Code:

import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint

db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()

lengthOfTable = 100000

# Fill up the tables with random data
for x in range(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):

    for x in range(numberOfRecords):
        # AUTO_INCREMENT ids start at 1 and randint is inclusive at both ends
        rand1 = randint(1,lengthOfTable)

        selectString = "SELECT * FROM " + testTable + " WHERE index_col = " + str(rand1)
        cur.execute(selectString)

setupString = "from __main__ import selectRandomRecords"

# Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []

for theLength in [3,10,30,100,300,1000,3000,10000]:

    innodb_times.append( timeit.timeit('selectRandomRecords("test_table_innodb",' + str(theLength) + ')', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('selectRandomRecords("test_table_myisam",' + str(theLength) + ')', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('selectRandomRecords("test_table_memory",' + str(theLength) + ')', number=100, setup=setupString) )
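
The fill loop above builds each INSERT by string concatenation and sends one statement per row. A possible alternative for the setup phase, sketched here under the assumption of a DB-API cursor that uses `%s` placeholders (as MySQLdb does), is to bind the values as parameters and send them in batches with `executemany`. The helper name `fill_table` and its `batch_size` parameter are made up for this example.

```python
import random

def fill_table(cursor, table, n_rows, batch_size=1000):
    """Insert n_rows of random doubles using parameterized, batched INSERTs."""
    # Table names cannot be bound as parameters, so `table` must be trusted input.
    sql = ("INSERT INTO " + table +
           " (value1, value2, value3, value4) VALUES (%s, %s, %s, %s)")
    inserted = 0
    while inserted < n_rows:
        size = min(batch_size, n_rows - inserted)
        rows = [tuple(random.random() for _ in range(4)) for _ in range(size)]
        cursor.executemany(sql, rows)   # one call per batch instead of per row
        inserted += size
    return inserted
```

A call such as `fill_table(cur, "test_table_innodb", lengthOfTable)` followed by `db.commit()` would stand in for the loop. Note that, unlike the original loop, this sketch generates different random rows for each table; to keep the tables identical, generate the rows once and pass the same list to each `executemany` call.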

2) Counts

Query: SELECT count(*) FROM tbl

Result: MyISAM wins

[Figure: Comparison of counts by different database engines]

This one demonstrates a big difference between MyISAM and InnoDB - MyISAM (and memory) keeps track of the number of records in the table, so this query is fast and O(1). The time InnoDB needs to count increases super-linearly with table size in the range I investigated. I suspect that many of the speed-ups from MyISAM queries observed in practice are due to similar effects.

Code:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to count the records
def countRecords(testTable):

    selectString = "SELECT count(*) FROM " + testTable
    cur.execute(selectString)

setupString = "from __main__ import countRecords"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in range(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('countRecords("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('countRecords("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('countRecords("test_table_memory")', number=100, setup=setupString) )
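
The difference measured here can be sketched with two toy tables: one that maintains a running row count and one that has to walk its rows on every count. This is a conceptual illustration only; `CountedTable` and `ScannedTable` are made-up names, and the `visible` flag is a simplified stand-in for the per-transaction visibility check that stops InnoDB from keeping a single shared counter.

```python
class CountedTable:
    """Keeps a running row count, so counting is O(1)."""

    def __init__(self):
        self.rows = []
        self.row_count = 0

    def insert(self, row):
        self.rows.append(row)
        self.row_count += 1

    def count(self):
        return self.row_count


class ScannedTable:
    """Stores no count; each count visits every row and checks visibility."""

    def __init__(self):
        self.rows = []

    def insert(self, row, visible=True):
        self.rows.append((row, visible))

    def count(self):
        return sum(1 for _, visible in self.rows if visible)


counted, scanned = CountedTable(), ScannedTable()
for i in range(3):
    counted.insert(i)
    scanned.insert(i, visible=(i != 1))   # pretend row 1 is not visible to this reader
print(counted.count(), scanned.count())
```

With a stored counter the cost of counting does not depend on table size; with a scan it grows with the number of rows, which matches the direction of the timings above.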

3) Conditional selects

Query: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5

Result: MyISAM wins

[Figure: Comparison of conditional selects by different database engines]

Here, MyISAM and memory perform approximately the same, and beat InnoDB by about 50% for larger tables. This is the sort of query for which the benefits of MyISAM seem to be maximized.

Code:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to perform conditional selects
def conditionalSelect(testTable):
    selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
    cur.execute(selectString)

setupString = "from __main__ import conditionalSelect"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in range(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('conditionalSelect("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('conditionalSelect("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('conditionalSelect("test_table_memory")', number=100, setup=setupString) )

4) Sub-selects

Result: InnoDB wins

For this query, I created an additional set of tables for the sub-select. Each is simply two columns of BIGINTs, one with a primary key index and one without any index. Due to the large table size, I didn't test the memory engine. The SQL table creation command was

CREATE TABLE
    subselect_myisam
    (
        index_col bigint NOT NULL,
        non_index_col bigint,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8;

where once again, 'MyISAM' is replaced by 'InnoDB' in the second table.

In this query, I leave the size of the selection table at 1000000 and instead vary the size of the sub-selected columns.

[Figure: Comparison of sub-selects by different database engines]

Here InnoDB wins easily. Once the table reaches a reasonable size, both engines scale linearly with the size of the sub-select. The index speeds up the MyISAM query but, interestingly, has little effect on InnoDB's speed.

Code:

myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []

def subSelectRecordsIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString = "from __main__ import subSelectRecordsIndexed"

def subSelectRecordsNotIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString2 = "from __main__ import subSelectRecordsNotIndexed"

# Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"

cur.execute(truncateString)
cur.execute(truncateString2)

lengthOfTable = 1000000

# Fill up the tables with random data
for x in range(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)

for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE subselect_innodb"
    truncateString2 = "TRUNCATE subselect_myisam"

    cur.execute(truncateString)
    cur.execute(truncateString2)

    # For each length, empty the table and re-fill it with random data
    rand_sample = sorted(random.sample(range(lengthOfTable), theLength))
    rand_sample_2 = random.sample(range(lengthOfTable), theLength)

    for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
        insertString = "INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
        insertString2 = "INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)

    db.commit()

    # Finally, time the queries
    innodb_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString) )
        
    innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString2) )
    myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString2) )

I think the take-home message of all of this is that if you are really concerned about speed, you need to benchmark the queries you are actually running rather than make assumptions about which engine will be more suitable.
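
For benchmarking your own queries, a small reusable harness along these lines may be a starting point. It is a sketch built on the standard library's `timeit.repeat`; the function names `benchmark` and `compare` and the choice to report best and median per-call times are assumptions of this example, not something taken from the tests above.

```python
import statistics
import timeit

def benchmark(label, func, repeat=5, number=100):
    """Call func() `number` times per run over `repeat` runs; report per-call seconds."""
    runs = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [t / number for t in runs]
    return {
        "label": label,
        "best": min(per_call),
        "median": statistics.median(per_call),
    }

def compare(candidates, **kwargs):
    """Benchmark each (label, callable) pair and sort fastest-first by best time."""
    results = [benchmark(label, func, **kwargs) for label, func in candidates]
    return sorted(results, key=lambda r: r["best"])
```

Each candidate would be a zero-argument callable that runs one query against one engine, for example `lambda: conditionalSelect("test_table_innodb")` paired with the label "innodb". Repeating the runs and looking at the best and median times makes it easier to spot noise from caching or other load on the server.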

Rick James · Answer 4 · 2012-05-24T11:36:43+08:00

Which is faster? Either might be faster. YMMV.

Which should you use? InnoDB -- crash-safe, etc, etc.
