Re: Best architecture for proxy?

On Wed, 11 Jul 2007 07:00:18 -0700, Andrew Warkentin wrote:
>On Jul 10, 8:19 pm, Steve Holden wrote:
>> Bjoern Schliessmann wrote:
>> > Andrew Warkentin wrote:
>> >> I am going to write a general-purpose modular proxy in Python. It
>> >> will consist of a simple core and several modules for things like
>> >> filtering and caching. I am not sure whether it is better to use
>> >> multithreading, or to use an event-driven networking library like
>> >> Twisted or Medusa/Asyncore. Which would be the better
>> >> architecture to use?
>> > I'd definitely use an event-driven approach with Twisted.
>> > Generally, multithreading is less performant than multiplexing. High
>> > performance servers mostly use a combination of both, though.
>> Conversely, I'd recommend Medusa - not necessarily because it's "better",
>> but because I know it better. There's also a nice general-purpose proxy
>> program (though I'd be surprised if Twisted didn't also have one).
>Would an event-driven proxy be able to handle multiple connections
>with large numbers of possibly CPU-bound filters? I use The
>Proxomitron (and would like to write my own proxy that can use the
>same filter sets, but follows the Unix philosophy) and some of the
>filters appear to be CPU-bound, because they cause The Proxomitron to
>hog the CPU (although that might just be a Proxomitron design flaw or
>something). Wouldn't CPU-bound filters only allow one connection to be
>filtered at a time? On the Medusa site, it said that an event-driven
>architecture only works for I/O-bound programs.

Handling all of your network traffic with a single OS thread doesn't
necessarily mean that all of your filters need to run in the same
thread (or even in the same process, or on the same computer).

Typically, however, a filtering rule should only need to operate on a
small number of bytes (almost always only a few kilobytes). Is it the
case that handling even this amount of data incurs a significant CPU
cost? If not, then there's probably nothing to worry about here, and
you can do everything in a single thread. If it is the case, then you
might want to keep around a thread pool (or process pool, or cluster)
and push the filtering work to it, reserving the IO thread strictly for
IO. This is still a win, since you end up with a constant number of
processes vying for CPU time (and you can tune this to an ideal value
given your available hardware), rather than one per connection. This
translates directly into reduced context switch overhead.
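
As a rough illustration of the pool arrangement described above, here is a
minimal sketch. It assumes the standard library's asyncio and
concurrent.futures modules rather than Twisted or Medusa, and the names
apply_filters, filter_chunk and FILTER_WORKERS are hypothetical, not taken
from any proxy mentioned in this thread.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size: a fixed number of workers, tuned to the
# available hardware, instead of one thread per connection.
FILTER_WORKERS = 4


def apply_filters(chunk: bytes) -> bytes:
    """Stand-in for a possibly CPU-bound filter chain (hypothetical)."""
    return chunk.replace(b"ads", b"***")


async def filter_chunk(pool: ThreadPoolExecutor, chunk: bytes) -> bytes:
    # Hand the filtering work to the pool so the event loop (the IO
    # thread) stays free to service other connections in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, apply_filters, chunk)


async def main() -> list:
    # Three chunks standing in for data read from three connections.
    chunks = [b"some ads here", b"no change", b"ads ads"]
    with ThreadPoolExecutor(max_workers=FILTER_WORKERS) as pool:
        # gather() returns results in the same order as its arguments.
        return await asyncio.gather(*(filter_chunk(pool, c) for c in chunks))


results = asyncio.run(main())
print(results)
```

In CPython, pure-Python filters running in threads still share the GIL, so
if the filters really are CPU-bound, swapping in ProcessPoolExecutor (the
process pool option mentioned above) is probably the better fit. The
structure of the sketch stays the same.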


Posted On: Wednesday 7th of November 2012 01:38:29 PM
