Friday, 2009-01-09

nkinkadepaulproteus: Can you log into a5 for a minute?00:02
nkinkade4 eyes are better than 2.00:02
nkinkadeApache is acting up, or is being affected by something else ...00:02
nkinkadeNothing has gone down yet, but it may.00:02
paulproteusnkinkade, Conveniently I have four eyes already!00:03
* paulproteus sshs to a500:03
nkinkadepaulproteus: pid 2420400:03
nkinkadeI'm already stracing it, but nothing overly interesting: rt_sigtimedwait(~[ILL TRAP ABRT BUS FPE KILL SEGV USR2 PIPE CONT STOP SYS RTMIN RT_1],00:04
nkinkadeI don't know what it's waiting on.00:04
paulproteusCan you ^C your strace for me?00:04
nkinkadepaulproteus: Done.00:05
nkinkadeswap is growing and the machine could become unresponsive.00:05
nkinkadeI've already got the first Nagios message that Web Services is down, so Apache is hit pretty hard.00:05
nkinkadeIt can't even return a 20 or 30 byte static HTML file.00:06
paulproteus#0  0x00002b1b0345e1c3 in do_sigwait () from /lib/
paulproteus#1  0x00002b1b0345e25d in sigwait () from /lib/
paulproteus#2  0x00002b1b03007a5b in apr_signal_thread () from /usr/lib/
paulproteus#3  0x00002b1b074ae70c in ?? () from /usr/lib/apache2/modules/mod_wsgi.so00:06
paulproteusIt's stuck in some mod_wsgi threading mess.00:06
paulproteusSounds no good to me; we could kill it abruptly.00:06
paulproteusShould only interrupt one request at most.00:06
paulproteusShall I?00:06
paulproteusIf other methods don't work, gdb'ing in and doing "call exit(0)" should do the trick.00:07
nkinkadepaulproteus: Please do kill it.00:07
paulproteusEr, now it's not quite responding to my touch.00:08
* paulproteus keeps waiting00:08
nkinkadeFor me either.00:08
nkinkadeI was afraid it might spiral downward.00:08
nkinkadepaulproteus: Did you use gdb to get that info above?00:08
paulproteus"sudo gdb -p $PID", then ^C to get a gdb prompt, then "bt" to get a backtrace.00:09
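The attach-and-backtrace steps described above can also be run non-interactively. The following is a minimal sketch, assuming gdb's standard `-p`, `-batch`, and `-ex` options; the helper name `backtrace_command` is hypothetical and not something used in the log. It only builds and prints the command line rather than running it.

```python
import shlex


def backtrace_command(pid, use_sudo=True):
    """Build an argv list that attaches gdb to `pid`, prints a backtrace, and exits.

    Hypothetical helper for illustration; it does not execute anything.
    """
    pid = int(pid)
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -batch makes gdb exit after running the -ex commands instead of
    # waiting at an interactive prompt.
    argv = ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]
    # Attaching to another user's process usually needs root, as in the log.
    return ["sudo"] + argv if use_sudo else argv


# Print the command for the PID mentioned in the log instead of running it.
print(shlex.join(backtrace_command(24204)))
```

To actually run it, the list could be passed to `subprocess.run`, which would need to happen on the affected host with sufficient privileges.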
nkinkadeThat's good information.00:09
nkinkadeAt least we now have some minimal idea of what might be causing it.00:09
paulproteusI emailed about this a bit ago in a discussion of core dumps I think.00:09
nkinkadeI may have to Rapid Reboot the thing.00:10
paulproteusOne sec.00:10
nkinkadeOr not.00:10
paulproteus(gdb) call exit(0)00:10
paulproteusAnd now we let the OS clean that sucker up.00:10
paulproteus...which is taking a while.00:11
paulproteusSweet! done.00:11
paulproteusThat's my favorite trick in the whole book.00:11
nkinkadecall exit(0)?00:11
paulproteusIt's the Jedi Mind Trick for processes that don't want to exit.00:11
nkinkadeI've learned something new.00:11
* paulproteus waves hands and flashes the debugger00:12
paulproteus"You want to exit successfully."00:12
paulproteus(gdb) call exit(0)00:12
paulproteusProgram exited normally.00:12
paulproteusThe program being debugged stopped while in a function called from GDB.00:12
paulproteusIt's a pretty evil trick.00:12
nkinkadeIt must be the metadata_scraper causing that.00:12
paulproteusIn theory I should maybe call _exit(0) to be even more evil.00:13
paulproteusDid you look at whose child it was before we killed it?00:13
paulproteusI guess we're now guessing.00:13
nkinkadeI don't know of anything else that uses mod_wsgi00:13
paulproteusOkay, that's good to know.00:13
nkinkadeAt least I don't think anything else uses it.00:13
nkinkadenathany: Is that correct as far as you know?  That the only thing on a5 using mod_wsgi is the metadata_scraper?00:14
paulproteus<paulproteus> #2  0x00002b1b03007a5b in apr_signal_thread () from /usr/lib/
paulproteusI wonder if that's the Apache Portable Runtime trying to respond to your signals.00:14
paulproteusLike your kill signals.00:14
nkinkadepaulproteus: Which kill signals?00:15
paulproteusDidn't you do kill -SOMETHING $PID?00:15
nkinkadepaulproteus: Nope.00:15
paulproteusOh, okay.00:15
paulproteusSomehow I thought you had.00:15
paulproteusMy way is a little more abrupt but much more guaranteed to work.00:15
nkinkadepaulproteus: However, it is possible that the watchdog tried to restart Apache and/or Varnish.00:15
paulproteusI seem to be mistaken about the purpose of that.00:15
paulproteusIt would be nice to have debugging symbols for these libraries so these stack traces are a little more useful.00:16
nkinkadeI only logged in, saw the process that was seemingly hung, and then straced it, to no avail.00:16
* paulproteus installs libapr1-dbg for the APR symbols at least00:16
nkinkadeIt was only coincidence that I logged in, and I noticed it was sluggish in giving me a prompt so I top'd the thing.00:16
paulproteusAnd I never passed it a signal. I think that was a red herring.00:17
nathanynkinkade: yes, AFAIK that's all on a500:17
nathanyyou can look at the apache conf files to be certain00:17
nathany(sorry for the delay, i was away from my desk and then reading the backlog to figure out what was going on)00:17
nkinkadenathany: It's possible that something with it or the scraper is causing these recent issues on a5.00:17
paulproteusnkinkade, You said it had huge memory usage?00:18
paulproteusI did no forensics other than the stacktrace and the death.00:18
nkinkadepaulproteus: Yeah, it seemed to have most all of the available physical memory and was slowly eating up swap.00:18
nathanyoh, damn; i wonder if i ever implemented the guard to keep the scraper from endlessly scraping the entire internet00:18
nkinkadeWe're all about the metadata.00:19
paulproteusmeat data00:19
nkinkadeAnd by god, we'll scrape the whole 'net if that's what it takes.00:19
nkinkadeI would be super happy if it were something like that.00:19
nkinkadeBecause a5 has been running really nicely since the RAM upgrade (and a7 too).00:20
nkinkadeUntil the past week or so, that is.00:20
nkinkadenathany: Were you serious when you said that, or was it a joke?00:23
nathanynkinkade: unfortunately serious00:23
nathanyand yeah, just looked and it's possible to construct a scraper honeypot00:23
nathanythat would keep it busy busy busy00:23
nkinkadenathany: Do you think you'll have time to fix that soon?00:24
nkinkadeIt seems pretty urgent.00:24
nathanyi'm opening a bug now; need to go home and walk the dog but i can look at it this evening00:24
nathanyshould be a straight-forward fix00:24
nkinkadeIf that's even the real problem ...00:24
nkinkadeHopefully that could be it.00:24
nathanyfeel free to nosy yourself if you want to follow along at home00:25
paulproteusnkinkade, Well, crisis averted.00:26
paulproteusThe thing is, though, the mod_wsgi part of Apache isn't actually the scraper.00:26
paulproteusThat's a totally separate Python process.00:26
paulproteusYou'd think the Python process would have been the one to blow up and use all the RAM.00:27
paulproteusDid you check if the Apache2 process, or instead some Python process, was using up the memory?00:27
paulproteusAlso, when I run top on a5, I see all the processes.00:28
nkinkadepaulproteus: I'm not actually sure.  I usually shift-m top to sort by memory, but I didn't do that this time00:28
paulproteusThat feels really strange.00:28
nkinkadeWhat is your screen resolution?00:28
* paulproteus nods00:28
paulproteusBut there are a few blank lines at the bottom.00:28
nkinkadeThere are only 90 processes, so that seems reasonable.00:29
paulproteusYeah, I just find that surprisingly few.00:29
nkinkadepaulproteus: Thanks for lending me a hand.  The gdb trick is a good one for me to know.00:29
paulproteusIt's pure evil.00:29
nkinkadeThe backtrace seems to have been useful, the call exit(0) I couldn't say.00:30
nkinkadeBut a5 doesn't run very much ... it's more or less dedicated to and ns100:31
nkinkadeOmg, why is portmap running?00:31
paulproteusTime for sudo apt-get remove nfs-common.00:32
* paulproteus beats you to it00:32
nkinkadepurge? :-)00:32
nkinkadeHow in heck did nfs-common get in there?00:32
paulproteusEh, just remove (and I also removed portmap itself).00:32
paulproteusI think it's part of the Debian default.00:32
paulproteusSince it's somewhat non-obvious that without it, you can't use NFS.00:33
nkinkadeI almost always go through and remove nfs-common and I'm sure I had done it for a5 and all the others.00:33
paulproteusYou might like the package cssh.00:33
nkinkadeMaybe it went in as a dep to some other package.00:33
paulproteusNope, I don't think so.00:34
nkinkadeNFS is in the default install but not less!00:34
paulproteusBecause it's obvious what to do when less says command not found.00:34
paulproteusIt's less obvious what to do when you try to mount a remote filesystem and mount hangs.00:34
nkinkadeBut do you think NFS is so utilized by everyone that it must be in the default install?00:35
nkinkadeI haven't used NFS in a long time.00:35
*** stevel_ has joined #cc00:35
nkinkadeThough I know you have.00:35
paulproteusBut still, if mount hangs, that's confusing.00:35
nkinkadeThat's reasonable.00:35
paulproteusIt doesn't give an error message like, "WTF install portmap."00:36
paulproteusWhich it probably should.00:36
nkinkadepaulproteus: How insecure do you think it would be to set ionice setuid?00:38
nkinkadeI imagine it's a fairly simple program, so it seems like maybe it wouldn't be too bad, but maybe I'm missing something.00:39
paulproteusIt can fork a shell.00:39
paulproteusSetuid ionice is equivalent to setuid bash.00:39
paulproteusWhy do you want that?00:39
paulproteusTo avoid installing ionicer?00:40
paulproteusThe more sensible thing to do apparently is to adjust the security capability preferences, says a friend of mine, so any program can adjust any other program's IO niceness.00:40
paulproteusThat's still not so great, which is why I still think ionicer is nicer.00:40
nkinkadeI would like to ionice the mysqldumps and the subsequent tarring of them, but I don't want to have to divert mysqldump and tar.00:41
paulproteusIsn't that done by cron?00:41
nkinkadeThis one is done remotely via SSH.00:41
paulproteusOh, that's even easier in a way.00:41
paulproteusIs it done by SSH as root?00:42
nkinkadeAs everett.00:42
paulproteusYou could make everett's shell /usr/bin/ionice!00:42
paulproteusEr, wait, same problem.00:42
paulproteusIn his .bashrc, add ionicer.00:42
nkinkadeWhat I like about the current way of doing the mysql backups is that it doesn't require any config on the remote machines.00:42
paulproteusOh, right.00:42
nkinkadeJust one script on the backup machine and ta-da.00:43
paulproteusExcept the creation of the user.00:43
paulproteusAnd the addition of the SSH key presumably.00:43
nkinkadeYeah, one thing.00:43
paulproteusBut I agree, that's not much configuration.00:43
nkinkadeOkay, two things.00:43
nkinkadeMaybe I'll try your suggestion of adding ionicer to .bashrc.00:44
paulproteusStill, I see what you mean.00:44
nkinkadeHe doesn't do anything that I wouldn't want to be ioniced.00:44
nkinkadeWhat I'm confused about is why these machines are so sensitive like this.00:45
paulproteusWe could modify ionicer so that it did fork a process, I would just have to be more careful about its security.00:45
nkinkade(to disk I/O)00:45
nkinkadeYou mean like:00:45
nkinkadeionicer somecommand --some-option00:46
paulproteusWhich is how ionice works in the first place.00:47
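The "prefix the command" usage discussed above is how the stock `ionice` tool is invoked: `ionice -c CLASS command args`. The following is a minimal sketch of wrapping a command that way; the helper name `with_ionice` and the example `tar` arguments are hypothetical, and nothing here reflects the interface of the local `ionicer` tool, which the log does not show.

```python
# Scheduling classes accepted by ionice's -c option.
IO_CLASSES = {"realtime": 1, "best-effort": 2, "idle": 3}


def with_ionice(command, io_class="idle"):
    """Return `command` prefixed so it runs under the given I/O scheduling class.

    Hypothetical helper for illustration; it only builds the argv list.
    """
    if not command:
        raise ValueError("command must not be empty")
    if io_class not in IO_CLASSES:
        raise ValueError("unknown I/O class: %r" % (io_class,))
    return ["ionice", "-c", str(IO_CLASSES[io_class])] + list(command)


# Illustrative only: tar up a directory of dumps at idle I/O priority.
example = with_ionice(["tar", "-czf", "dumps.tar.gz", "dumps"])
print(" ".join(example))
```

Because the wrapper runs as the calling user, it avoids the setuid problem raised earlier, though some kernels restrict which classes an unprivileged user may select.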
paulproteusIt'd be nice (!) to make a /bin/nicebash shell.00:47
paulproteusBut that's somewhat stupidly special-purpose.00:47
nkinkadeRight, and this is why I thought about setuid'ing ionice, but clearly I didn't think it through!00:48
paulproteusI'm going to think about other things, like this BayPiggies talk I'm giving soon.00:48
paulproteusAlso I do like the window manager called awesome so far.00:49
*** nathany has quit IRC00:51
paulproteusCan't get there from here!00:51
*** stevel has quit IRC00:52
*** TimStarling has joined #cc01:10
*** stevel_ has quit IRC01:25
*** tanjir has quit IRC01:25
*** tanjir has joined #cc01:26
*** Bovinity has quit IRC01:34
*** iamiamiam has joined #cc01:35
iamiamiami need help!!!!!01:37
*** iamiamiam has left #cc01:38
greg-gcall 911?01:39
*** mlinksva has quit IRC02:03
*** everton137 has joined #cc02:51
*** everton137 has quit IRC03:06
*** nkinkade has left #cc03:09
*** Danny_B has quit IRC05:11
*** mlinksva has joined #cc05:17
*** Danny_B has joined #cc05:18
*** tanjir has quit IRC06:18
*** jgay has joined #cc06:40
*** mlinksva has quit IRC06:56
*** TimStarling has left #cc08:38
*** UncleCJ2_ has quit IRC09:16
*** UncleCJ2_ has joined #cc09:17
*** grahl has joined #cc11:49
*** jgay has quit IRC13:26
*** zer0 has joined #cc13:36
zer0can i get cc here?//////13:39
*** zer0 has quit IRC13:45
*** Ekushey has joined #cc14:31
*** mlinksva has joined #cc14:55
*** grahl has quit IRC15:17
*** UncleCJ2_ has quit IRC15:25
*** jgay has joined #cc15:37
*** nkinkade has joined #cc15:44
*** Ekushey- has joined #cc16:09
*** Ekushey has quit IRC16:21
*** Ekushey- has quit IRC16:46
*** nathany has joined #cc16:52
*** stevel has joined #cc16:54
*** m3cr3d1s has joined #cc17:00
m3cr3d1sYou'll all be happy to know I added a white background to the CC icon on twitter17:07
*** cacimar has joined #cc17:32
*** cacimar has quit IRC17:33
*** cacimar has joined #cc17:43
*** Bovinity has joined #cc17:48
*** mecredis_ has joined #cc18:01
*** m3cr3d1s has quit IRC18:03
*** stevel has quit IRC18:04
*** stevel has joined #cc18:04
*** balor has joined #cc18:49
paulproteusmecredis, Is a white background better than a transparent one?19:15
*** Prog66 has joined #cc19:26
*** Prog66 has left #cc19:26
*** m3cr3d1s has joined #cc19:29
*** rohitj has quit IRC19:30
*** mecredis_ has quit IRC19:34
*** mecredis_ has joined #cc19:52
*** m3cr3d1s has quit IRC19:53
Bovinityoh, my, god,
nkinkadeThat's awesome!20:49
nkinkadeI guess Colbert will be very upset, and maybe even litigious!20:49
nkinkade(not likely)20:49
*** cacimar has quit IRC21:07
paulproteusnathany, So you recorded the video separately from the audio, and then mux'd them later. Do you recall with what?21:14
paulproteus(And it seems I'll use gtk-recordmydesktop for the video recording.)21:15
nathanypaulproteus: gtk-recordmydesktop sounds familiar21:16
nathanyi used a command line tool of your suggestion21:16
paulproteusOh, right, I asked you this before.21:16
* paulproteus chuckles and sighs.21:16
nkinkadeBreasts and domain names, a really sensible combination:
nathanyW. T. F.? i came in at the wrong time21:51
Bovinityoh boy...21:51
nathanyseems like a ridiculous combination to *ME* ;)21:52
*** pktck has joined #cc22:03
*** balor has quit IRC22:07
*** pktck has quit IRC22:11
*** pktck has joined #cc22:11
*** pktck_ has joined #cc22:21
*** pktck__ has joined #cc22:26
*** pktck_ has quit IRC22:26
*** pktck has quit IRC22:30
nkinkadenathany: I see you just committed some changes to the scraper.  Does that mean that the potential issue with the scraper should be fixed?22:40
nathanyi need to test + update, but possible22:40
nkinkadeJust want to know where things stand in case any problems crop up with a5 again.22:40
*** pktck__ has quit IRC23:06
*** pktck has joined #cc23:06
*** nathany has quit IRC23:15
*** tim_hwang has joined #cc23:28
*** pktck has quit IRC23:28
*** pktck has joined #cc23:28
Bovinitynkinkade: ping23:29
nkinkadeBovinity: Hi.23:31
Bovinitynkinkade: how do i add an email address to the photocopier address book?23:32
Bovinityspecifically, mine23:32
nkinkadeBovinity: One second.23:33
*** rohitj has joined #cc23:37
nkinkadeBovinity: Did you figure it out?23:52
nkinkadeYou have to press the login button then enter a really secret password: 1111123:53
Bovinitylogin button?23:53
nkinkadeAt that point you have System and User settings.  I can't remember where the address book is.23:53
nkinkadeThere is a button at the top right of the LCD display.23:54
nkinkadeI don't know what it says, but it should be something that is semi-indicative of logging in or admin'ing the thing.23:54
*** mecredis_ has quit IRC23:55
nkinkadeOnce you get into the right settings area, there is an address book option.23:55
Bovinityyeah, got it ;)23:56
Bovinityi didn't notice the login button on the panel23:56
nkinkadeBovinity: Did you get your address entered okay?23:58
Bovinitynkinkade: yep, thx23:58
