From cd6644ea4ddc78597934ab0ef5ba50e3c3daa927 Mon Sep 17 00:00:00 2001
From: Mitja Felicijan <m@mitjafelicijan.com>
Date: Sat, 8 Jul 2023 23:25:41 +0200
Subject: Moved to a simpler SSG

---
 ...e-case-of-elasticsearch-allocation-failure.html | 64 ++++++++++++++++++++++
 1 file changed, 64 insertions(+)
 create mode 100755 public/the-strange-case-of-elasticsearch-allocation-failure.html

diff --git a/public/the-strange-case-of-elasticsearch-allocation-failure.html b/public/the-strange-case-of-elasticsearch-allocation-failure.html
new file mode 100755
index 0000000..fdb32a0
--- /dev/null
+++ b/public/the-strange-case-of-elasticsearch-allocation-failure.html
@@ -0,0 +1,64 @@
+<!doctype html><html lang=en-us><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><link href="data:image/x-icon;base64,AAABAAEAEBAAAAEAIABoBAAAFgAAACgAAAAQAAAAIAAAAAEAIAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAL69vf8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAv76+/8LBwQkAAAAAAAAAAAAAAAC+vb3/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAL+9vf/Bv78JAAAAAAAAAAAAAAAAu7q6/wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC7ubr/vr29CAAAAAAAAAAAy8nJAZ6foP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAnqGj/6GipAoAAAAAHLjU/xcXHf/BwsL/I8XY/yPK3v8XGiD/IbjL/yPF2f8XGiD/Fxkf/yLF2f8gnK3/Fxog/62ztv8fwNf/FRcd/x271v8mz93/GRsi/xkXHf8p097/GiIp/xobIv8p0t3/KdPe/xocIv8fYmr/KNPe/xoZH/8aHCL/J87c/xy81/8VFxz/IsPZ/8zS0/8XGiD/Ir/R/yPH2/8XGiD/Fxkf/yPH2/8dd4T/GBog/yPJ3f8jyNr/uru9/xcUGv8cudb/EhITDKi5vRKlvMP/RUpOERwcHRAdOj4QHTk8EBwdHRAdNTgQHTo/EBwcHRAcHB0QSGduEKW4vf+koqQfHzg+EBqz0ewSFRv7EyMr/xq51vsTERb7ExUb+xq41fsau9j7ExUb+xiPp/sZudb7ExUb+xMVG/sZuNX/GKvI/BIUGfMdvdn/IrfL/xcaIP8n1eb/J9Dh/xkcIf8ZGR7/J8/f/xxCSv8ZGyH/J9Dg/ybQ4P8ZHCL/FSQs/yPK3/8UExj/GE1b/ybS5P8ZGB7/Ghwj/ynW5P8p2Ob/Ghwi/yWrtv8p1eH/Ghwi/xocIv8p1uT/J8XT/xkcIv8m1un/Hb7d/xUYH/8hzOr/HtHu/xcaIf8XGB//I8vi/xgxOv8XGSD/I8rg/yPK4P8XGiD/GUFL/yPP6f8SERj/Fhkh/x3A4f8AAAAAJ2f9/ydr//8mZPH/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlYu38J2v//ydo/f8AAAAAAAAAAAd8/fkFqf//Iob8sAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMY39awWr//8FfP3/AAAAAAAAAAAFm/7/SfD//wR+/f8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAOB/f9B7v//BaX+/wAAAAAAAAAAQ878SAyZ/v9n1v4KAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADu9v8DDJb+/z3N/XgAAAAA3/sAAN/7AADf+wAA3/sAAAAAAAAAAAAAAAAAAN/7AAAAAAAAAAAAAAAAAAAAAAAAj/EAAI/5AACP8QAA3/sAAA==" rel=icon type=image/x-icon><title>The strange case of Elasticsearch allocation failure</title><meta name=description content="I&amp;#39;ve been using Elasticsearch in production for 5 years now and never had asingle problem with it."><link rel=alternate type=application/rss+xml title="Mitja Felicijan's 
posts" href=https://mitjafelicijan.com/index.xml><link rel=alternate type=application/rss+xml title="Mitja Felicijan's notes" href=https://mitjafelicijan.com/notes.xml><style>body{padding:1rem;max-width:760px;background:#fff;font-family:times new roman,Times,serif;line-height:1.35rem}hr{margin-block-start:1.5rem}h1,h2,h3{line-height:initial}footer{margin-block-start:3rem}table{max-width:100%;border-collapse:separate;border-spacing:2px;border:1px solid #000;border-left:1px solid #999;border-top:1px solid #999}blockquote{font-style:italic}table thead{background:#eee}td,th{border:1px solid #000;padding:4px;border-right:1px solid #999;border-bottom:1px solid #999;text-align:left}pre{text-wrap:nowrap;overflow-x:auto;margin-block-start:1.5rem;margin-block-end:1.5rem;padding:.5rem 0;border-top:1px solid #000;border-bottom:1px solid #000}pre code{line-height:1.3em}pre,code,pre *,code *{font-family:monospace;font-size:initial!important}img,video,audio{max-width:100%}header{display:flex;flex-direction:row;gap:3rem}nav{display:flex;gap:.75rem}.pstatus-orange{background:gold}.pstatus-green{background:#9acd32}.pstatus-red{background:#cd5c5c}@media only screen and (max-width:600px){header{flex-direction:column;gap:1rem}a{word-wrap:break-word}}</style><header><nav class=main><a href=/>Home</a>
+<a href=https://git.mitjafelicijan.com/ target=_blank>Git</a>
+<a href=https://files.mitjafelicijan.com/ target=_blank>Files</a>
+<a href=/mitjafelicijan.pgp.pub.txt target=_blank>PGP</a>
+<a href=/curriculum-vitae.html>CV</a>
+<a href=/index.xml target=_blank>RSS</a></nav></header><main><div><h1>The strange case of Elasticsearch allocation failure</h1><p>Mar 29, 2020<div><p>I've been using Elasticsearch in production for 5 years now and never had a
+single problem with it. Hell, I never even knew there could be a problem. It just
+worked. All this time. The first node that I deployed is still being used in
+production, never updated, upgraded, or touched in any way.<p>All this bliss came to an abrupt end this Friday when I got a notification that
+the Elasticsearch cluster went warm. Well, warm is not that bad, right? Wrong!
+Shortly after that I got another email which sent chills down my spine. The cluster
+is now red. RED! Now, shit really hit the fan!<p>I tried googling what could be the problem and, after executing the allocation
+command, noticed that some shards were unassigned and that 5 allocation attempts had already
+been made (which, BTW, is the maximum, just my luck), and that meant I was basically fucked.
+The search results also implied that one should wait for the cluster to re-balance itself. So, I
+waited. One hour, two hours, several hours. Nothing, still RED.<p>The strangest thing about it all was that queries were still being fulfilled.
+Data was coming out. On the outside it looked like nothing was wrong but
+anybody who looked at the cluster would know immediately that something
+was very, very wrong and we were living on borrowed time here.<blockquote><p><strong>Please, DO NOT do what I did.</strong> Seriously! Please ask someone on the official
+forums or, if you know an expert, please consult them. There could be a million
+reasons for this failure, and this solution fit my problem. Maybe in your case it would be
+disastrous. I had all the data backed up, and even if I failed spectacularly
+I would be able to restore the data. It would be a huge pain and I would lose a
+couple of days, but I had a plan B.</blockquote><p>Executing the allocation command told me what the problem was, but offered no clear solution yet.<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>GET /_cat/allocation?format=json
+</span></span></code></pre><p>I got an <code>ALLOCATION_FAILED</code> message with the additional info <code>failed to create shard, failure ioexception[failed to obtain in-memory shard lock]</code>. Well,
+splendid! I must also say that our cluster is more than capable of handling
+the traffic. Also, JVM memory pressure was never an issue. So what really
+happened then?<p>I also tried re-routing the failed shards, with no success, due to AWS restrictions on
+their managed Elasticsearch clusters (they lock some of the functions).<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>POST /_cluster/reroute?retry_failed=true
+</span></span></code></pre><p>I got a message that significantly reduced my options.<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>{
+</span></span><span style=display:flex><span>  &#34;Message&#34;: <span style=color:#a31515>&#34;Your request: &#39;/_cluster/reroute&#39; is not allowed.&#34;</span>
+</span></span><span style=display:flex><span>}
+</span></span></code></pre><p>After that I went on a hunt again. I won't bore you with all the details,
+because hours/days went by until I was finally able to re-index the problematic
+index and hope for the best. Until that moment even re-indexing was giving me
+errors.<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>POST _reindex
+</span></span><span style=display:flex><span>{
+</span></span><span style=display:flex><span>  &#34;source&#34;: {
+</span></span><span style=display:flex><span>    &#34;index&#34;: <span style=color:#a31515>&#34;myindex&#34;</span>
+</span></span><span style=display:flex><span>  },
+</span></span><span style=display:flex><span>  &#34;dest&#34;: {
+</span></span><span style=display:flex><span>    &#34;index&#34;: <span style=color:#a31515>&#34;myindex-new&#34;</span>
+</span></span><span style=display:flex><span>  }
+</span></span><span style=display:flex><span>}
+</span></span></code></pre><p>I needed to do this multiple times to get all the documents re-indexed. Then I
+dropped the original index with the following command.<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>DELETE /myindex
+</span></span></code></pre><p>Then I re-indexed the new index back into the original one (well, original by name only).<pre tabindex=0 style=background-color:#fff><code><span style=display:flex><span>POST _reindex
+</span></span><span style=display:flex><span>{
+</span></span><span style=display:flex><span>  &#34;source&#34;: {
+</span></span><span style=display:flex><span>    &#34;index&#34;: <span style=color:#a31515>&#34;myindex-new&#34;</span>
+</span></span><span style=display:flex><span>  },
+</span></span><span style=display:flex><span>  &#34;dest&#34;: {
+</span></span><span style=display:flex><span>    &#34;index&#34;: <span style=color:#a31515>&#34;myindex&#34;</span>
+</span></span><span style=display:flex><span>  }
+</span></span><span style=display:flex><span>}
+</span></span></code></pre><p>On the surface it looks like everything is working, but I have a long road in front of
+me to get all the things working again. The cluster now shows that it is in green
+status, but I am also getting a notification that the cluster has a processing status,
+which could mean a million things.<p>Godspeed!</div></div></main><footer><hr><div><h3>Want to comment or have something to add?</h3>You can write me an email at
+<a href=mailto:m@mitjafelicijan.com>m@mitjafelicijan.com</a> or catch up
+with me
+<a href=https://telegram.me/mitjafelicijan target=_blank>on Telegram</a>.</div><hr><p>This website does not track you. Content is made available under
+the <a href=https://creativecommons.org/licenses/by/4.0/ target=_blank rel=noreferrer>CC BY 4.0 license</a> unless specified
+otherwise. Blog feed is available as <a href=/index.xml target=_blank>RSS feed</a>.</footer><script src=https://cdn.usefathom.com/script.js data-site=XHQARKXP defer></script>
\ No newline at end of file
-- 
cgit v1.2.3