Wednesday, 2010-09-22

* greg-g waves14:47
davisI am using System V semaphores on a 2.4 kernel.  I am wondering if there is a way to free a semaphore which has been created by a nonexistent program.  (I.e. the process which did semget() crashed.)  I've tried semctl() with IPC_RMID but it fails.  semop() with IPC_NOWAIT does not help either.14:50
paroneayeadavis: this isn't the C Compiler channel, sorry!14:50
paroneayeait's the Creative Commons channel :)14:50
davisi thought it was ##C14:51
davissorry my bad14:51
davisbest of luck14:51
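
For readers who land here with davis's question: System V semaphore sets outlive the process that created them, and IPC_RMID is only permitted for the set's owner or creator (or root), which is one common reason the call fails. A minimal sketch, assuming a Linux system with the util-linux `ipcs` and `ipcrm` tools and the usual `ipcs -s` column layout (key, semid, owner, perms, nsems); the helper name `sem_ids_owned_by` is made up for illustration:

```shell
# Hypothetical helper: read `ipcs -s` output on stdin and print the IDs of
# the semaphore sets owned by the given user. Assumes the util-linux column
# order: key, semid, owner, perms, nsems.
sem_ids_owned_by() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Example (run as that user or as root); each ID is passed to `ipcrm -s`:
#   ipcs -s | sem_ids_owned_by davis | while read -r id; do ipcrm -s "$id"; done
```

Older `ipcrm` versions use a different argument syntax, so check the local man page before relying on `-s`.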
JED3hey nkinkade, do we have nagios configured to monitor
nkinkadeJED3: Yeah.17:45 was just down, but i didnt see any nagios messages17:45
nkinkadeJED3: How long was it down for?17:46
nkinkadeDo you get emails for all Nagios alerts?17:46
JED3i'm not sure17:46
JED3nkinkade: yes17:46
JED3at least, I assumed that i do17:47
nkinkadeJED3: I see that Nagios is still configured to query on a8 ... the only reason it's still working is because Varnish is still forwarding requests to the backup machine.17:47
JED3ahh, does nagios follow redirects to check status?17:48
nyerglervarnish doesn't redirect -- it makes a proxy request on your behalf17:49
nyerglertransparent to nagios/users17:49
JED3at any measure, i'm looking around the logs on and i'm not seeing any evidence as to why apache went down17:50
nkinkadeJED3: Varnish doesn't actually redirect, but just fetches the content from a remote backend.  The client has no idea.17:52
nkinkadeOh, I didn't read nyergler reply.  Sorry.17:52
nkinkadeJED3: So absolutely Apache went down on, or you just noticed problems for a minute?17:53
JED3i noticed it was down, saw a varnish error response, and then promptly restarted apache17:54
nkinkadeHmm.  a5 has over 10,000 unique hosts connected to it at this moment.18:37
paroneayeathat's over 9000!18:38
nkinkadeMaybe I missed the reference.18:38
akozakwhat you say18:39
paroneayeankinkade: you missed the reference, but you didn't miss much :)18:39
nkinkadeKeep in mind, that is unique *hosts*, not unique connections.  There are a lot more connections.18:40
nkinkadea5 is consequently dragging a bit.18:40
nkinkadeStill working, but just a bit slowly.18:40
mralexparoneayea: NINE THOOOUUUUSSSAAAAND18:45
mralexJED3: weird, you keep doing something on zupport that nukes the URL path settings of some pages18:48
paroneayeamralex: What nine thousand?18:48
JED3mralex: yes i've had to reset them a couple times now18:53
paroneayeankinkade: man, no kidding, a5 is slow19:32
paroneayeaalmost a bit nervous to restart the engine with the new translations19:33
* paroneayea tests building out on staging first19:34
paroneayeaI'm always nervous to run ./bin/buildout on the live server lest it fail halfway through and make the python environment completely fucked19:34
nkinkadeparoneayea: How long does buildout take?19:36
nkinkadeIs it at all disk intensive?19:36
nkinkadeIf so, you should always run with "sudo ionice -c3 <command>"19:36
*** balleyne has joined #cc19:36
paroneayeankinkade: it's not too disk intensive, but it does take as long as 5-10 minutes19:36
paroneayeabecause it queries a bunch of servers to check for new packages19:37
paroneayeaand sometimes it rebuilds lxml or something.19:37
paroneayeait's faster if you do -N but I need it to fetch new packages19:38
paroneayeaon that note, I didn't actually know about ionice, I just knew about nice19:39
paroneayeaI like how you can also read it as Ion Ice19:39
paroneayeawhich sounds very futurist19:39
paroneayeamralex: win19:41
paroneayeankinkade: if the server is under so much strain atm though, maybe I should avoid wiping the deed cache?19:44
nkinkadeparoneayea: It appears to be a little better now.19:44
nkinkadeparoneayea: All our servers are very sensitive to heavy disk I/O, so anything that might load up the disk and will run for more than a minute or two should probably be run using ionice.19:46
nkinkadeFor example, running the following on any of our busier servers would surely cause all services to fail in a short time:19:47
paroneayeaengine back up19:47
nkinkade$ dd if=/dev/zero of=big.bin bs=1G count=2519:47
paroneayeanormal restart time19:47
paroneayeaalways makes me nervous anyway19:47
paroneayeankinkade: that sure is a big.bin19:48
nkinkadeparoneayea: On occasion I have to create one that is 50G, for Varnish.19:48
nkinkadeRunning *that* command without ionice would be about equivalent to pulling the power cable from the server.19:49
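
The advice above can be sketched as a small wrapper. `ionice -c3` puts a command in the idle I/O class, where it only gets disk time when nothing else has asked for it; how well this works depends on the kernel's I/O scheduler. The function name `run_idle` and the fallback behaviour are assumptions for illustration, not something from the channel:

```shell
# Hypothetical wrapper: run a command in the idle I/O scheduling class when
# ionice is available and permitted, otherwise run it unchanged.
run_idle() {
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 "$@"
    else
        # Fallback: no ionice, or this user may not set the idle class.
        "$@"
    fi
}
```

With it, nkinkade's example would become `run_idle dd if=/dev/zero of=big.bin bs=1G count=25`, prefixed with `sudo` where the idle class needs root.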
akozaki dont understand why the os doesnt stop that from happening19:53
nkinkadeparoneayea: Did you say you were going to hold off on resetting the cache?19:55
nkinkadeHow about just deleting all Greek Deeds?19:55
paroneayeankinkade: I could do that19:55
nkinkadefind all deed.el and delete them.19:55
nkinkadeand of course all files like deed inside of the gr/ dirs19:55
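
A rough sketch of that cleanup with `find`, reading "files like deed inside of the gr/ dirs" as files whose names start with `deed` anywhere under a directory named `gr`. The helper name `greek_deeds` and the cache layout are assumptions; the real deed cache path is not given in the log. It relies on `-delete`, which GNU and BSD find both provide:

```shell
# Hypothetical helper: list cached Greek deeds under a cache root, or remove
# them when the second argument is "delete". Matches files named deed.el
# anywhere, plus files named deed* below any gr/ directory.
greek_deeds() {
    root=$1
    action=${2:-print}
    if [ "$action" = "delete" ]; then
        find "$root" -type f \( -name 'deed.el' -o \( -path '*/gr/*' -name 'deed*' \) \) -delete
    else
        find "$root" -type f \( -name 'deed.el' -o \( -path '*/gr/*' -name 'deed*' \) \) -print
    fi
}
```

Running it once without `delete` shows what would be removed before anything is touched.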
nkinkadeakozak: The OS doesn't know which process is most important unless you tell it so.19:56
akozakbut couldnt there be some way to avoid one process hogging all the resources19:57
paroneayeankinkade: done!19:57
akozakunless you allow it19:57
paroneayeaakozak: you can set default nice levels iirc19:57
nkinkadeakozak: The OS doesn't allow that.19:57
paroneayeamaybe I'm wrong :)19:58
nkinkadeThe OS shares the resources among processes, but some processes are very sensitive to delays in I/O operations.19:58
nkinkadeIf Varnish/Apache can't get requests out the door immediately, then requests start to stack up, the performance goes down and it all snowballs until the machine grinds to a halt.19:59
akozakright, i guess my question is why the os wouldnt try to mitigate that or something. i also don't really know what im talking about.19:59
akozakah i see19:59
akozakthat makes sense19:59
nkinkadeHence the existence of nice, renice and ionice.19:59
paroneayeaI'm pretty sure I got the IonIce upgrade to my blaster in Metroid at some point.20:01
nkinkadeI'm no expert on all this either, but I mostly understand what is happening at a relatively high level.20:01
nkinkadeparoneayea: I see you just deleted Greek deeds, no?20:01
paroneayeankinkade: yeah20:03
paroneayeahence <paroneayea> nkinkade: done!20:03
nkinkadeAwesome.  Thanks!  I look forward to the day when I can log into a convenient web interface and just press a button to launch translations.20:03
paroneayeankinkade: me too :)20:05
nkinkadeAlthough, it is pretty easy to just say "Hey, paroneayea, launch Greek."20:06
nkinkadeMaybe even easier than logging into a web interface and clicking a button.20:06
nkinkadeIf it weren't for the lack of instant gratification, I'd just as soon stick with our present mode of operation.20:07
paroneayeacloudsource your tasks with paroneayea-update20:07
akozaknow thats job security20:07
paulproteusnkinkade: and ionicer :P20:08
* paulproteus waves while updating
nkinkadeOoo.  Yeah, I forgot paulproteus' nice tool that allows regular users to ionice stuff.20:09
nkinkade(as usually only root can do it)20:09
nkinkadepaulproteus: Have you got ionicer into Debian yet?20:10
nkinkadeI say that half jokingly, but actually part serious.20:10
nkinkadeThe ability for a regular user to run ionice seems nice.20:10
paulproteusnkinkade: I feel the same way about it (-:20:10
paulproteusLike the opposite of "Don't tread on me".20:11
nkinkadepaulproteus: What's the meaning of this on the discovered VPS?20:17
nkinkade-rw-r--r-- 1 501 staff 273 Sep  3 20:53 /etc/hosts20:17
* paulproteus blinks20:17
paulproteusIt doesn't look familiar to me, and looks quite icky.20:18
nkinkadesecuretty and securetty.dpkg-old have the same issue.20:18
nkinkadeI guess I'll just fix it up.20:18
paulproteusWell, let me do a "find -uid 501" first20:19
paulproteusIt's just the files you pointed to.20:19
paulproteusHow very odd.20:19
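
The ownership check paulproteus ran can be sketched as below. The chat uses `find -uid 501`; this version uses `-user`, the POSIX spelling, which also accepts a numeric ID when no account has that name. The helper name `owned_by_uid` is hypothetical, and `-xdev` is an added assumption to keep the scan on one filesystem:

```shell
# Hypothetical helper: print every path under a directory that is owned by
# the given numeric user ID, without crossing into other filesystems.
owned_by_uid() {
    dir=$1
    uid=$2
    find "$dir" -xdev -user "$uid" -print
}

# Example, as root, for the case discussed here:
#   owned_by_uid / 501
```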
paulproteusMy current hypothesis would be that the machine was pwned by some RedHat-centric attacker (hence user IDs based on 500, not starting at 1000 like Debian).20:20
paulproteusBut I don't even see why they'd do that.20:20
paulproteusSo one alternative is some weird configuration script for some program.20:20
paulproteusSeptember 3... I don't think I was doing much on that VPS that Friday.20:20
nkinkadepaulproteus: Yeah, it's odd.  I only knew about it because dpkg complained when I was upgrading some packages a minute ago.20:20
paroneayea(03:19:46 PM) Christopher Webber: are we doing a tech call this week?20:21
paroneayea(03:20:27 PM) Nathan Yergler: Uh, no. I'm on an airplane in chicago right now :)20:21
nkinkadepaulproteus: It was probably me on Sept 3, updating /etc/hosts20:21
paulproteusHmm, okay.20:21
paulproteusYeah, I'm pretty sure.20:54
nkinkadeOkay.  Good ... just making sure. :-)20:55
nkinkadeThe permissions on those 3 files are a bit disconcerting.20:55
nkinkadeIt's a mystery to me how they could have got that way.20:56
