Production and Maintenance: 2010

Thursday, December 30, 2010

Mesh Adaptivity

The research problem we are trying to solve right now is based on this paper. The paper describes a tetrahedral mesh adaptivity scheme based on "edge markings". Each face (2-cell) of a tetrahedron has one of its edges (1-cells) marked. A set of markings is valid if the following are true:

1. Whenever two tetrahedrons share a face, that face is marked the same way in each of the tetrahedrons. (That is, each face has a unique marking, even if it's in more than one tetrahedron.)

2. In each tetrahedron, there exists an edge that is marked on both of the faces that contain it.

The paper shows how to use these markings to do refinement. The problem we are trying to figure out is how to do coarsening. In particular, when you coarsen the mesh by contracting an edge to nothing, you merge pairs of opposite faces into one face, so the question is how to change the markings so that it is still valid. We want to do this in a way that involves changing the fewest number of markings.

There are a few different ways we can go about doing this:

1. The method described in the paper about how to set up the initial markings says that we can establish a total order over all the edges, then have each face mark the edge that is first in that order. This raises the question of whether or not this property - that there exists a total order over the edges such that the marked edge in each face is the first in that total order - is preserved over the refinement process. If so, then we have a simple algorithm for merging two faces when doing refinement: just choose one of the faces arbitrarily and have the new edges inherit that edge's position in the order, then remark them according to that order. This obviously will not affect any tetrahedron that doesn't contain one of the edges that was just merged, so we can keep the re-marking down to a small number of tetrahedrons.

When we generate a tetrahedron of type P, all the marked edges don't include the new vertex, so the property is preserved if we just put all the new edges last in the partial order. When we generate a tetrahedron of type A, it is a little more complicated - the property is still preserved but we have to put some of the edges between other edges in the order (I'll put a diagram up soon.) There is still a complication of when we generate a new tetrahedron where the new vertex was already generated (by refining a different tetrahedron that the refined edge was also a part of) because now we have to make sure that the same position for that edge in the order will satisfy the conditions on both of the faces I'm not sure yet if that's actually satisfied.

2. We could look at relations between this problem and the Boolean satisfiability problem (SAT). This problem has a very natural Boolean-satisfiability representation: the proposition "face F has edge E marked" is one of the variables for each pair (F,E), then the conditions that each face has exactly one edge marked as well as the conditions (1) and (2) above are easily expressed. If we wanted to try to minimize the number of changes we have to make then we could think of it as a sort of weighted maximum satisfiability problem where we try to also satisfy as many of the propositions "marking X wasn't changed" as possible. We could then use a MaxSAT solver like this one to figure out the best solution. This direction is probably not the best way of doing it because we have to convert the problem to a different form, solve it, and convert the answer back, plus the size of the problem will be the size of the whole mesh and we really want to only look at a small neighborhood of cells (for parallelization purposes if nothing else).

Another approach is to try to reduce the satisfiability problem to our problem; that would show that it is NP-complete. Of course that wouldn't give us an actual algorithm; it would just tell us we should stop looking for an exact polynomial-time algorithm. So this would be nice to have but isn't the first direction we should look.

3. Another way to think about the problem is to imagine each possible "switch" of an marking on one face from one edge to another edge as a node in a directed graph. Suppose that we switch a marking on a face of one tetrahedron to make the markings on that tetrahedron valid. That may make the markings on the adjacent tetrahedron (the one on the "other side" of that face) not valid because we also have to switch that marking on that tetrahedron. That may require switching yet another marking, and so on. Note that any position is only one "marking switch" away from a valid position, so we only have to do one marking switch at each step. In the directed graph, for each marking switch that necessitates another marking switch, put an edge to the possible choices of marking switches to make it valid, and for those marking switches that will make it valid on both sides and allow you to halt the process, mark those as "end nodes." Then the problem is just to find the shortest path from the change you are forced to make due to the merging to an end node, which is a simple matter of breadth-first search (or Dijkstra's algorithm, if we want to weight the marking switches by "how hard" they are to make). It will probably end up a little more complicated because an edge contraction affects lots of faces rather than just one.

I'll try to post up some more ideas in the next couple days.

Monday, December 20, 2010

Masters or Ph.D?

I just talked to Jeff Erickson last week, and he gave me some very interesting feedback. He said that I am very smart and willing to work very hard, and that I am able to get a lot done, provided that I am give a specific problem to work on. However, the one thing that I haven't shown myself to be abole to do yet is to make independent decisions about which directions to take my research and what are important research problems to work on, and that's what I will need to be able to do to get a Ph.D. He suggested that a better direction is to focus on getting a Masters' thesis first, and then at that point re-evaluating whether I want to continue toward a Ph.D. or just stop there.

The more I think about it, the more I think that a Masters may be the best option for me, for a few reasons:

1. First, one thing about getting a Ph.D. in computer science is that it doesn't actually involve writing that much code; or at least code isn't the primary product. As my advisor said, the main "product" you're producing as a Ph.D. student (and later as a professor) is ideas; code is just the way of implementing those ideas. But I think my specialty isn't necessarily really in developing theory, but more in the combination of math/theory and code. In particular, there are lots of people who are experts in the theory (or who are experts in a specific problem domain, like engineers) but aren't that good at actually writing code, and there are lots of people who can crank out lots of code without really understanding the theory, but I am very good at both understanding the theory and algorithms and translating it into code. So my main niche is to bridge that gap.

2. In particular, when I look at the things that really sound cool to me and "wow, I would really like to be working on this," it's generally not the theoretical stuff; it's more the stuff that involves taking a real-world problem, one not necessarily in computer science, and figuring out how to formalize and understand it in a mathematical context so you can write code to solve it. For example:

2a. At my work at the U.S. Census Bureau, we were studying disclosure avoidance in survey data - that is, how to disseminate data from local level surveys in ways that preserve the statistical patterns in the data but don't allow users to figure out who answered what to any particualr question. This question involves first defining specifically what it means for a respondent's privacy to be preserved (What if an adversary can make a good guess at the missing information but only some of the time? What if an adversary has access to a different database unrelated to the survey that they can combine with that information? Is there a way to prove things about what an adversary can figure out?), then to figure out what algorithms can implement that, then to code and test those algorithms.

2b. I was offered a job at Amazon.com after graduating from University of Maryland, though I chose not to take it so I could go to graduate school. One of the projects they were working on there was figuring out how to optimize their web pages to increase sales - so they had collected all this data about how the page layout and the time it takes to load the page affect users' behavior. Again, we have a practical problem, then there's a step of formalizing the problem (Exactly what function are you trying to optimize? Is it just number of sales, or are other things like which products get sold or how long it takes the user to go through the process also important? How can you determine which changes in output are caused by which factors? What data can be used to shed light on these questions?), figuring out what algorithms can be used to analyze the data, and then implementing and testing these algorithms.

2c. Another example of something I tried to do once was when I was learning how to produce foam tipped arrows for Belegarth (a game in which players dress up as medieval warriors and fight each other with padded replicas of medieval weapons). There are several factors that you need to consider when making the arrow, such as the type of foam to use, the weight of the arrow, how hard or soft the tip is, and so on. My first response was to try to figure out if there was a way to mathematically model what happens when the arrow strikes the target so as to better understand why the design is the way it is and if there is any way to improve it. I did some research and soon concluded that the formulas themselves wouldn't be overly complicated but that the data you would need to calibrate the model isn't readily available. But you can see this is another example of how I like to try to understand a problem by putting it into a mathematical framework.

3. Of course, most of the problems above aren't actually computer science research problems; they're engineering or business problems that have a math/computer science component. But I think that might be the kinds of things I want to work on: to work closely with the engineers or business people to understand what the computational needs are, then go and figure out what computer science techniques can meet those needs*. I talked to Bob Haber about this and he said that if I wanted to actually "do an application" for my Ph.D. thesis, I would essentially have to go back and take some undergraduate level engineering (or physics, or whatever) courses to learn all the field-specific stuff I would need to know to do research in that area. So if I want to do this kind of stuff, it would make much more sense to work at a lab or company than in academia.

So, the plan we have come up with is as follows. I'll work toward a Masters, and at the same time I'll also start looking for internships and jobs elsewhere. Los Alamos National Laboratories and Sandia National Laboratories do lots of scientific computing stuff so they are good possibilities. Then once I get a Masters, I will decide whether I want to stop there or whether I want to keep going.

*Although, I might be naively optimistic as to how much sophisticated algorithmic stuff is actually involved in these types of problems. I am reminded of a professor from University of Maryland (I think it was Bill Gasarch but I'm not sure) who, when he was teaching about job-scheduling algorithms, planned to bring his wife (who worked in a machine shop) in to talk about all the exciting new techniques they use to optimize their scheduling to improve production. He quickly cancelled this plan when she informed him that the actual algorithm they use is "whoever yells the loudest gets their job scheduled first."

Tuesday, December 7, 2010

Research Blog Launched

Right now, we are nearly finished with the prototype of our project; we have a program that implements the main TentPitcher algorithm, and it is integrated with the physics code so we can run the physics simulations, and we can also visualize the output. However,now is the time that we have decided that I need to move beyond just doing the implementation and coding work and start doing independent research. We still haven't decided yet exactly what my research topic will be, although we have identified several ideas. I will be using this blog to chronicle my progress in selecting a research topic and later in doing research.

By the way, the title of this blog is the name of the last phase of the turn in the civilization-building board game "Through the Ages." I thought it was funny because, to the uninitiated, "production and maintenance" sounds more like work than like something that you would do in a game. This title expresses my hope that getting a Ph.D. will be hard work, but that it will also be fun and exciting.