HOME| SERVICES | CONTACTS


#DBA #MySQL #Databases




Published on: March 15, 2018

MariaDB parameters matter!

While performing the technical audit for a new Client (eCommerce website, Bitrix CMS) we quickly noticed that the average load time for their customers facing website was slow:

Top page load speed (initial readings).
attempt #1 (11.5 sec)

attempt #2 (12.9 sec)

attempt #3 (12.1 sec)

Products group page load speed (initial readings).
attempt #1 (10.7 sec)

attempt #2 (13.2 sec)

attempt #3 (12.5 sec)


Obviously those readings were beyond the acceptance levels for a website with the main intention to sell products (and to keep aggressively increasing the sales volumes). Therefore, this issue needed the immediate attention!

When this performance issue was brought up with the Client, they confirmed they're aware of it as it had been there for a long while. The Client's was pushing for the migration to a new more powerful but more expensive hardware to improve the load time. But the Roki Team were reluctant to agree with the migration plans until the proper investigation is conducted to confirm whether the hardware is really outdated or doesn't keep up with a users volume.

A better hardware would almost surely improve the figures, however, spending more money instead of addressing the route cause is not what we like doing in Roktech. Instead, the request to perform the deeper technical investigation was issued, and the Client approved such request.


Initial investigation

The starting point for the investigation was to look at the site utilization stats and to correlate those figures with the capabilities of the hosting there the front-end system was placed.

Looking at the Google Analytics statistics and grepping the server's logs quickly gave us the picture of the website utilization in terms of users/connections:



Then the actual hosting hardware was inspected. The investigated site was hosted on a dedicated server (with no other sites being hosted there nor resources sharing). The server itself turned out to be a very decent powerful machine:

[[email protected] ~]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 44
Model name:            Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
Stepping:              2
CPU MHz:               1600.000
CPU max MHz:           2401,0000
CPU min MHz:           1600,0000
BogoMIPS:              4800.21
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K
NUMA node0 CPU(s):     0-3,8-11
NUMA node1 CPU(s):     4-7,12-15

[[email protected] ~]$ cat /proc/meminfo
MemTotal:       16256332 kB
MemFree:         1340484 kB
MemAvailable:    9667944 kB
Buffers:          987212 kB
Cached:          3658216 kB
[[email protected] ~]$ cat /proc/meminfo
MemTotal:       16256332 kB
MemFree:         1340484 kB
MemAvailable:    9667944 kB
Buffers:          987212 kB
Cached:          3658216 kB
SwapCached:        76976 kB
Active:          7222160 kB
Inactive:        3298800 kB
Active(anon):    4361932 kB
Inactive(anon):  1553304 kB
Active(file):    2860228 kB
Inactive(file):  1745496 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      16777212 kB
SwapFree:       15782552 kB

And a number of various hardware test ruled out any HW defects/issues with the server.


By having these figures collected and by looking at the performance of similar systems (using information available via public sources) the Team drew a conclusion that under the current load the existing hardware should perform 3-5 times faster, i.e. allowing to load the pages within a timeframe of 2-4 seconds.

At this point the migration to a new hardware was not the option anymore as it became obvious that the hardware was sufficient and was unlikely causing any bottlenecks.


What was it then?

The next step was to have a closer look at the actual page load process and its nature. Bitrix CMS has the built-in debugger which was very handy for this purpose. A few refreshes of the top page with the enabled debugger revelaed that the significant portion (10-11 secs) of the total load time was generated by execution of SQL queries.

This fact alone eliminated many other possible scenarios and set the fundamental vector for further investigation - we needed to look at the DB side of the system, i.e. queries, MariaDB setting etc.

And it wasn't long before we identified the type of the queries taking that long to execute:

SELECT DISTINCT BS.*, B.LIST_PAGE_URL, B.SECTION_PAGE_URL, B.IBLOCK_TYPE_ID, B.CODE as IBLOCK_CODE, B.XML_ID as IBLOCK_EXTERNAL_ID, BS.XML_ID as EXTERNAL_ID, DATE_FORMAT(BS.TIMESTAMP_X, '%d.%m.%Y %H:%i:%s') as TIMESTAMP_X, DATE_FORMAT(BS.DATE_CREATE, '%d.%m.%Y %H:%i:%s') as DATE_CREATE ,COUNT(DISTINCT BE.ID) as ELEMENT_CNT
FROM
b_iblock_section BS
INNER JOIN b_iblock B ON BS.IBLOCK_ID = B.ID
INNER JOIN b_iblock_section BSTEMP ON BSTEMP.IBLOCK_ID = BS.IBLOCK_ID
LEFT JOIN b_iblock_section_element BSE ON BSE.IBLOCK_SECTION_ID=BSTEMP.ID
LEFT JOIN b_iblock_element BE ON (BSE.IBLOCK_ELEMENT_ID=BE.ID AND ((BE.WF_STATUS_ID=1 AND BE.WF_PARENT_ELEMENT_ID IS NULL ) AND BE.IBLOCK_ID = BS.IBLOCK_ID ) AND BE.ACTIVE='Y' AND (BE.ACTIVE_TO >= now() OR BE.ACTIVE_TO IS NULL) AND (BE.ACTIVE_FROM <= now() OR BE.ACTIVE_FROM IS NULL))
WHERE
1=1 AND BSTEMP.IBLOCK_ID = BS.IBLOCK_ID AND BSTEMP.LEFT_MARGIN >= BS.LEFT_MARGIN AND BSTEMP.RIGHT_MARGIN <= BS.RIGHT_MARGIN AND BSTEMP.GLOBAL_ACTIVE = 'Y' AND ((((BS.ACTIVE='Y')))) AND ((((BS.GLOBAL_ACTIVE='Y')))) AND ((((BS.IBLOCK_ID = '24')))) AND (BS.LEFT_MARGIN >=285) AND (BS.RIGHT_MARGIN <=287) AND ((((BS.DEPTH_LEVEL <= '7')))) AND ((((B.ID = '24'))))
GROUP BY
BS.ID, B.ID
ORDER BY
BS.LEFT_MARGIN asc 

In a short while we were already running those SQLs in the Staging environment (created as a very close copy of the Production environment) just to confirm a similarly long execution figures:


So that was the problem. But why? What was the root cause?


Root cause

Before diving in deep waters of DB design and queries structures, we decided to have a quick Google shortcut/search to see if anyone had ever documented any issues with similar DB queries against MySQL/MariaDB. And... after checking a number of no-use forum threads the luck was finally on our side!

The guy nailed it down in this stackexchange discussion.

it's a known bug they have with MariaDB, just had to comment out the config about join_buffer_size, apparently anything higher than the default value somehow causes problem to that specific query.

To make it explicitly clear, here we're talking about the join_buffer_sizei parameter located inside the my.cnf configuration file for the MariaDB. And then it was not a surprise for us to find out that the value of this parameter was way higher than the default value (128K) in our case:

[[email protected] etc]# grep join_buffer_size my.cnf
join_buffer_size = 64M
[[email protected] etc]#

Commenting it out was the fix which went to the Staging first, then further deployet to the Production after the performance improvement had been verified:


Top page load speed ( initial vs. post-fix readings).
attempt #1 (11.5 sec)
 

attempt #2 (12.9 sec)
 

attempt #3 (12.1 sec)
 


Conclusion

Not optimal configuration of MySQL/MariaDB has been known for causing performance issues. In this example changing one value in the configuration file helped to significantly speed up the load time of the poorly performing pages, and, as a result, helped to avoid of the risky and costly migration in the immediate term.

There were still a number of things in that system to address from the performance point of view to keep the load time constantly under 1.5 seconds, and we'll mention them in our future articles.




MORE STORIES





NEED HELP?

If you need help with your Company's IT challenges - we are that Team!



1. Send us an inquiry.
2. Shortly we will contact you to discuss your project and begin to prepare a proposal.
2. Once agreement has been made, our Team will start to deliver your project to your requirements and satisfaction.





CONTACT US

PLEASE GET IN TOUCH WITH US!


EMAIL: [email protected]

TEL (London, UK): +44 20 3807 7541
TEL (New York, USA): +1 347 620 7187
TEL (Manila, Philippines): +63 2 231 2519