{"id":2176,"date":"2016-11-22T20:23:56","date_gmt":"2016-11-22T18:23:56","guid":{"rendered":"http:\/\/soci.hu\/blog\/?p=2176"},"modified":"2017-01-17T19:22:04","modified_gmt":"2017-01-17T17:22:04","slug":"nagy-tablak-joinolasa","status":"publish","type":"post","link":"https:\/\/soci.hu\/blog\/index.php\/2016\/11\/22\/nagy-tablak-joinolasa\/","title":{"rendered":"Nagy t\u00e1bl\u00e1k joinol\u00e1sa"},"content":{"rendered":"<p>Egyik foly\u00f3 munk\u00e1mban t\u00f6bb t\u00edzmill\u00f3 soros t\u00e1bl\u00e1kon v\u00e9gzett joinokat kellett optimaliz\u00e1lni. \u00c1ltal\u00e1ban ez nem kih\u00edv\u00e1s, mert szinte mindig vannak sz\u0171r\u00e9si felt\u00e9telek, amelyeket kell\u0151 k\u00f6zels\u00e9gbe v\u00edve a t\u00e1bl\u00e1khoz \u00e9s rendes indexekat al\u00e1pakolva m\u00e1r csak p\u00e1r ezer joint kell v\u00e9grehajtani, ami gyors lesz.<br \/>\nDe most t\u00e9nyleg \u00f6ssze kellett joinolni sok milli\u00f3 sort, sz\u0171r\u00e9s n\u00e9lk\u00fcl.<br \/>\nMit lehet ezzel kezdeni? Sajnos itt m\u00e1r el\u00e9gg\u00e9 behat\u00e1rolt ter\u00fcleten mozgunk. A norm\u00e1l indexel\u0151s megold\u00e1sok nem seg\u00edtenek, mivel minden t\u00e1bl\u00e1t teljes eg\u00e9sz\u00e9ben be kell j\u00e1rni (nincs where).<br \/>\nR\u00e1ad\u00e1sul ha *-os a select, akkor a cover NC index se j\u00e1tszik, hogy legal\u00e1bb az IO cs\u00f6kkenne.<br \/>\nMerge joinra lehet j\u00e1tszani clu indexekkel, de az\u00e9rt ez korl\u00e1tos ter\u00fclet sok t\u00e1bla eset\u00e9n, illetve p\u00e1rhuzamos tervek eset\u00e9n mag\u00e1t\u00f3l nem fog merge joint haszn\u00e1lni (<a href=\"http:\/\/dba.stackexchange.com\/questions\/22217\/sql-server-does-not-optimize-parallel-merge-join-on-two-equivalently-partitioned\">itt<\/a> \u00edrnak egy trace flagr\u0151l, amivel m\u00e9gis r\u00e1 lehet venni).<br \/>\nMit lehet tenni. Egyik lehet\u0151s\u00e9g el\u0151re elk\u00e9sz\u00edteni a join indexelt view-ban. Erre \u00fcgyesen r\u00e1harap az optimizer, ha van olyan join amit azt\u00e1n t\u00f6bbsz\u00f6r futtatunk, akkor meg\u00e9ri ez a denormaliz\u00e1l\u00e1s.<br \/>\nHa viszont van \u00fajabb szerver\u00fcnk (2016), akkor van sokkal durv\u00e1bb lehet\u0151s\u00e9g: Columnstore index.<br \/>\nAz a baj ugye a nagy joinnal, hogy ak\u00e1rhogy is tr\u00fckk\u00f6z\u00fcnk, ez nagy mel\u00f3 a prociknak \u00e9s az IO alrendszernek (vink\u00f3knak). Az indexed view ezt \u00fagy oldja meg, hogy egyszer kell megcsin\u00e1lni, azt\u00e1n sokszor \u00e9lvezni az el\u0151re \u00f6sszepakolt adatokat.<br \/>\nA columnstore viszont (di\u00f3h\u00e9jban) az\u00e9rt piszok gyors mert:<br \/>\n1. 5-10-szeresen t\u00f6m\u00f6r\u00edtve t\u00e1rolja az adatokat, kevesebb IO, illetve a mem\u00f3ri\u00e1ban a buffer cache-t is jobban ki tudja haszn\u00e1lni (mintha t\u00f6bb RAM-unk lenne)<br \/>\n2. K\u00e9pes az adatok csak egy r\u00e9sz\u00e9t felolvasni, ha csak kev\u00e9s oszlop kell (select *-on ez nem seg\u00edt persze)<br \/>\n3. K\u00e9pes batch m\u00f3dban bel\u00fclr\u0151l p\u00e1rhuzamosan v\u00e9grehajtani a m\u0171veletek egy r\u00e9sz\u00e9t (ez nagyon durv\u00e1n megdobja)<br \/>\n4. K\u00e9pes a sorok egy r\u00e9sz\u00e9t felolvasni where felt\u00e9tel alapj\u00e1n, mivel minden 1m sorhoz (szegmens) nyilv\u00e1n tarja az adott oszlop min \u00e9s max \u00e9rt\u00e9k\u00e9t<br \/>\n5. Le tud nyomni oper\u00e1torokat (pl. sum) a storage engine-be, \u00edgy nem kell adatokat passzolgatni a r\u00e9tegek k\u00f6z\u00f6tt.<\/p>\n<p>No, l\u00e1ssuk a medv\u00e9t. L\u00e9trehoztam k\u00e9t m\u00e1solatot egy 100 milli\u00f3 soros t\u00e1bl\u00e1b\u00f3l. A tesztg\u00e9p egy k\u00e9t \u00e9ves laptop 2 core-ral \u00e9s 8G RAM-mal, SSD-vel. Nem egy szerver.<br \/>\nA k\u00e9t t\u00e1bl\u00e1t a kulcsai ment\u00e9s join-olom, \u00edgy mind a 100 milli\u00f3 sort v\u00e9gig kell n\u00e9znie, \u00e9s ennyi tal\u00e1lat is lesz.<\/p>\n<p>El\u0151sz\u00f6r sima Clu index:<br \/>\ncreate clustered index IX_Clu1 on B1(Id)<br \/>\ncreate clustered index IX_Clu2 on B2(Id)<\/p>\n<p>select count(*) from B1 join B2 on B1.Id = B2.Id<\/p>\n<p>SQL Server parse and compile time:<br \/>\n   CPU time = 15 ms, elapsed time = 18 ms.<\/p>\n<p>(1 row(s) affected)<br \/>\nTable &#8216;B1&#8217;. Scan count 5, logical reads 1141262, physical reads 6,<br \/>\nread-ahead reads 1138814, lob logical reads 0, lob physical reads 0,<br \/>\nlob read-ahead reads 0.<br \/>\nTable &#8216;B2&#8217;. Scan count 5, logical reads 1140956, physical reads 4,<br \/>\nread-ahead reads 1138821, lob logical reads 0, lob physical reads 0,<br \/>\nlob read-ahead reads 0.<br \/>\nTable &#8216;Workfile&#8217;. Scan count 896, logical reads 480256, physical reads<br \/>\n2688, read-ahead reads 477568, lob logical reads 0, lob physical reads<br \/>\n0, lob read-ahead reads 0.<br \/>\nTable &#8216;Worktable&#8217;. Scan count 0, logical reads 0, physical reads 0,<br \/>\nread-ahead reads 0, lob logical reads 0, lob physical reads 0, lob<br \/>\nread-ahead reads 0.<\/p>\n<p> SQL Server Execution Times:<br \/>\n   CPU time = 477262 ms,  elapsed time = 377318 ms.<br \/>\nSQL Server parse and compile time:<br \/>\n   CPU time = 0 ms, elapsed time = 1 ms.<\/p>\n<p> SQL Server Execution Times:<br \/>\n   CPU time = 0 ms,  elapsed time = 0 ms.<\/p>\n<p>377 m\u00e1sodperc.<\/p>\n<p>J\u00f6het a columnstore index:<br \/>\ncreate clustered columnstore index IX_CStore1 on B1<br \/>\ncreate clustered columnstore index IX_CStore2 on B2<\/p>\n<p>select count(*) from B1 join B2 on B1.Id = B2.Id<\/p>\n<p>SQL Server parse and compile time:<br \/>\n   CPU time = 0 ms, elapsed time = 6 ms.<\/p>\n<p>(1 row(s) affected)<br \/>\nTable &#8216;B2&#8217;. Scan count 4, logical reads 0, physical reads 0,<br \/>\nread-ahead reads 0, lob logical reads 105018, lob physical reads 0,<br \/>\nlob read-ahead reads 0.<br \/>\nTable &#8216;B2&#8217;. Segment reads 103, segment skipped 0.<br \/>\nTable &#8216;B1&#8217;. Scan count 4, logical reads 0, physical reads 0,<br \/>\nread-ahead reads 0, lob logical reads 104998, lob physical reads 0,<br \/>\nlob read-ahead reads 0.<br \/>\nTable &#8216;B1&#8217;. Segment reads 102, segment skipped 0.<br \/>\nTable &#8216;Worktable&#8217;. Scan count 0, logical reads 0, physical reads 0,<br \/>\nread-ahead reads 0, lob logical reads 0, lob physical reads 0, lob<br \/>\nread-ahead reads 0.<\/p>\n<p> SQL Server Execution Times:<br \/>\n   CPU time = 79920 ms,  elapsed time = 27834 ms.<br \/>\nSQL Server parse and compile time:<br \/>\n   CPU time = 0 ms, elapsed time = 0 ms.<\/p>\n<p> SQL Server Execution Times:<br \/>\n   CPU time = 0 ms,  elapsed time = 0 ms.<\/p>\n<p>377 sec vs. 28 sec. Az\u00e9rt ez massz\u00edv k\u00fcl\u00f6nbs\u00e9g. :)<\/p>\n<p>\u00c9rdekess\u00e9gk\u00e9ppen megn\u00e9ztem NC Columnstore index-szel is, \u00fagy 60 sec j\u00f6n ki. Ez se rossz.<\/p>\n<p>A j\u00f6v\u0151 h\u00e9ten lehet ki tudjuk pr\u00f3b\u00e1lni egy nagyobb g\u00e9ppel is, k\u00edv\u00e1ncsi vagyok, ott mit tudunk vele kihozni.<\/p>\n<p>Ha esetleg valakinek vannak m\u00e1r gyakorlati sikerei, \u00e9rdekelnek a sz\u00e1mok.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Egyik foly\u00f3 munk\u00e1mban t\u00f6bb t\u00edzmill\u00f3 soros t\u00e1bl\u00e1kon v\u00e9gzett joinokat kellett optimaliz\u00e1lni. \u00c1ltal\u00e1ban ez nem kih\u00edv\u00e1s, mert szinte mindig vannak sz\u0171r\u00e9si felt\u00e9telek, amelyeket&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,4,30,78,87,95],"tags":[],"class_list":["post-2176","post","type-post","status-publish","format-standard","hentry","category-adatbazisok","category-szakmai-elet","category-sql-server","category-sql-server-2012","category-sql-server-2014","category-sql-server-2016"],"_links":{"self":[{"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2176","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=2176"}],"version-history":[{"count":4,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2176\/revisions"}],"predecessor-version":[{"id":2181,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/posts\/2176\/revisions\/2181"}],"wp:attachment":[{"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=2176"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=2176"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/soci.hu\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=2176"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}