33

This question, Why would a language be untranslatable by universal (machine) translator?, inspired me to consider: if one possessed a universal translator (UT), is it reasonable to think that it could be used to translate an encrypted message? Of course, if a UT could be used for decryption, those doing the encrypting could be expected to devise a way to thwart it. But could the UT be used to decrypt more primitive forms of encryption, such as a message generated by an Enigma machine?

Edit: The question was asked as to how it would work. As I perceive it, the UT would have a complete programmatic understanding of how we communicate, in all forms, with updates whenever we find new ways of communicating that hadn't been conceived before.

Edit #2: When it encounters a previously unknown language, it would internally look for similarities to known languages. If it finds a similar language, it would apply the rules of that language until those rules fail, then seek another language with a higher translation success rate and use that, repeating this testing until it finds the best match. If no known language matches satisfactorily, it would then apply general rules of language structure, such as common words and phrases, until it finds a structure that works, and use that. I am not a linguist, so I do not know if there are other tests. If there are, they would apply too, all programmatically. If all tests fail, it would request additional information or wait until additional data was received, such as further conversation or a chart of common words.

But, that is not what I am looking for. What I am looking for is, using whatever programmatic format, can a UT identify code?

One more point about changing encryption. It could be possible that a language changes pronunciation in a similar manner. That is, whatever encryption you envision could be the natural language of somewhere.

Sensii Miller
  • 12
    That depends. How does your hypothetical universal translator know that its translation is correct? – Philipp Nov 21 '16 at 22:42
  • 5
    Philipp makes a good point, MS notepad can "translate" a program into text, but the result is gibberish. So your UT might happily try and translate whatever message it was fed without error, but encrypted messages might come out as gibberish. – Samwise Nov 21 '16 at 22:55
  • 1
    You might want to tag this science-based to rule out magic. As for the Enigma, since it is a flawed encryption system that leaves hints of the original message and its key space is very small compared to what even a modern computer can do, yes... but it's not particularly interesting to talk about deciphering flawed encryption. – Schwern Nov 22 '16 at 01:29
  • 11
    I think a key question you need to answer before asking such a question as this: if someone intends to deceive you into believing their words mean one thing when they in fact mean another, what does your UT do? When you call up a girl to ask her out, and she says "maybe later," does it translate that into "I don't like you that much?" – Cort Ammon Nov 22 '16 at 05:01
  • 4
    @Schwern The Enigma is claimed by Wikipedia to have had a theoretical key space of around 380 bits, reduced to around 76 bits in what might be considered reasonable situations. I haven't verified against the cited source, but the citation points toward the NSA (yes, that NSA). I'm not sure I would consider that flawed by design; it was secure enough that, had it been used correctly, chances are it wouldn't have been broken. It was lack of operating discipline that allowed the Enigma to be broken, not lack of cryptographic strength. – user Nov 22 '16 at 09:25
  • 2
    I'm now actually wondering if you could make a "halting problem" argument against the existence of a UT, but first you'd have to nail down the definition. – pjc50 Nov 22 '16 at 11:17
  • All answers so far have assumed a sort of mapping from one known language to another. But to really answer the question, you have to tell us what is "universal" about it? In other words, is it like Google Translate where both languages need to be known and one is mapped to another (not really universal), or is it like the one in Star Trek, where it can translate to any language, which kind of works by er... magic. If the latter, then you need to tell us, how the hell does it work, before the question can be answered. – komodosp Nov 22 '16 at 13:30
  • @pjc50 exactly what I was thinking. A naive definition of UT sounds too powerful in the context of computability. – Memming Nov 22 '16 at 14:05
  • @MichaelKjörling There was certainly a lack of discipline (like the cillies) but the Allies also managed to capture a pretty good amount of key material, so we might have gotten it even if the Nazis had been cleverer about it. – MissMonicaE Nov 22 '16 at 15:07
  • 1
    Languages codifications are meaningless and have no innate knowledge. It is all chicken scratch and effectively is no better than randomness. We give language meaning as we do music and many other things. Hence for any "universal" translator to work it must have been programming with the mappings from one language to the other. If an encryption is used, than all the universal translator needs is the algorithms. This is akin to a dictionary for any other language. – AbstractDissonance Nov 22 '16 at 15:10
  • 1
    @colmde The Star Trek universal translator does have a tendency to fail at, shall I call it plot-convenient times? That is, if it fails to do what's needed to ensure communication with an alien species, it always does so at times when such failure is a serious problem. Compare e.g. the Voyager episode The Swarm. – user Nov 22 '16 at 15:20
  • 1
    @AbstractDissonance "If an encryption is used, than all the universal translator needs is the algorithms." Plenty of people over on [crypto.se] and [security.se] would beg to disagree with you on that. – user Nov 22 '16 at 15:21
  • 1
    @AbstractDissonance "...all the universal translator needs is the algorithms. This is akin to a dictionary for any other language." Yes, if the universal translator had the secret key used to encrypt the message, it would know what it meant. But being a universal translator in no way helps you figure out what the secret key was. – Doval Nov 22 '16 at 20:35
  • 6
    One interesting, plot-convenient thing that might happen: the translator translates, but it's still encrypted. – SIGSTACKFAULT Nov 22 '16 at 20:38

20 Answers

56

No

I will make the assumption that the universal translator generates a consistent mapping from words, expressions, or idiom in one language to another. I would also assume that it uses samples of language to map potential meanings onto any phrase or sentence, and through some super-advanced algorithm, quickly discerns the meaning of what is spoken.

The problem with cryptography, in this context, is that there is no direct mapping of a word (or expression, or idiom) from one language to another. If I say something, and public key encrypt it, and then you say the same exact thing and encrypt it, those two statements might not be the same thing! Their encoding depends on the user's key. Furthermore, if I say something today, and then the same thing tomorrow, when encrypted those won't necessarily be the same thing either!

The problem with encryption is that there is no consistent 'language' to be translated. There cannot be a map of a concept like 'blue' to a discrete subset of an encrypted message. 'Blue' could potentially be encrypted any way, depending on what specific cypher is being used at that moment by that user.

Therefore, based on the assumptions of how a Universal Translator works that I started with, the Universal Translator will not be able to translate an encrypted message.
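
To make the point about 'blue' concrete, here is a minimal Python sketch. It swaps in a one-time pad (XOR with a fresh random key) as the simplest cipher to demonstrate with, which is an assumption of this example and not the public-key scheme mentioned above. All function and variable names are hypothetical.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def otp_encrypt(message: bytes) -> tuple[bytes, bytes]:
    """Encrypt with a fresh random pad; returns (pad, ciphertext)."""
    pad = secrets.token_bytes(len(message))
    return pad, xor_bytes(message, pad)


def otp_decrypt(pad: bytes, ciphertext: bytes) -> bytes:
    """Recover the message by XORing the ciphertext with the same pad."""
    return xor_bytes(ciphertext, pad)


message = b"the sky is blue"

# The same sentence said "today" and "tomorrow": each call draws a new pad,
# so the two ciphertexts are unrelated to each other.
pad_today, ct_today = otp_encrypt(message)
pad_tomorrow, ct_tomorrow = otp_encrypt(message)

# Without the pad, ct_today fits every message of the same length equally
# well: there is always some pad that "decrypts" it to a different sentence.
decoy = b"the sky is pink"
decoy_pad = xor_bytes(ct_today, decoy)
```

Here `otp_decrypt(pad_today, ct_today)` gives back the original sentence, while `otp_decrypt(decoy_pad, ct_today)` gives the decoy. Nothing in the ciphertext itself says which of the two is "the translation".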

kingledion
  • 9
    Yup, the difference between encoding and encryption comes to mind here. Encryption that satisfies Kerckhoffs' principle would have the unique key you speak of. Encryption that does not satisfy the principle is fundamentally insecure. – Bob Nov 22 '16 at 01:03
  • 4
    few languages have only direct mapping of words and expressions. most have contextual cues, subtext, and nuance. that's why google translate creates gibberish so often. A universal translator would need to actually be able to think to understand these. – John Nov 22 '16 at 01:48
  • 8
    @John The point is: given the same words, contextual clues, subtext, and nuance as input, the UT should generate the same output every time. With an encrypted message, given the same byte stream as input, the output could be different every time. – kingledion Nov 22 '16 at 13:01
  • 8
    I think the point everyone is trying to get at here is that true perfect encryption produces an output that is utterly indistinguishable from random noise to anyone who doesn't have the correct method and key to decrypt it. In technical terminology, this is what distinguishes encryption from encoding. A hypothetical perfect universal translator would be able to decode any encoded message, but would find as much meaning in an encrypted message as it would in the sound of running water, or the noise a Geiger counter makes when pointed at uranium, or TV static on a dead channel. – anaximander Nov 22 '16 at 14:42
  • 8
    I understand people's points about encryption, but I think you're underestimating how impossible of a task a universal translator is. There's a reason Egyptian hieroglyphs completely defied translation until the discovery of the Rosetta Stone. And the sorts of universal translation we see on star trek where the computer starts translating words immediately within the first 2 words without any real life context or reference is just as mathematically impossible. – Shufflepants Nov 22 '16 at 14:54
  • But if we weren't talking about UT's that are as magical as they are on star trek. If they actually have to listen to a language in context and/or ask questions back to a native speaker for a few hours to get even a rudimentary rapport, then I could definitely agree that having UT's won't break encryption. – Shufflepants Nov 22 '16 at 14:56
  • 1
    "underestimating how impossible of a task a universal translator is" -> brute-forcing UT is already a thing, albeit at an early stage of development. It is a near-certainty that this will be a real-world product in the future when the technology improves. – Alex Celeste Nov 22 '16 at 20:59
  • 1
    @Leushenko I was talking about the kind you see in star trek. Sure, if you're dealing with a language that is similar to and has common origin with languages you've already seen before, and you have a large corpus of text to study, automated UT is possible. But not without that corpus and without similarities to go on which would almost certainly be the case with an alien language. Your example is like showing me that there are automated cipher solvers and suggesting that it's a near-certainty that there will be a one-time-pad solver in the future when technology improves. – Shufflepants Nov 23 '16 at 18:31
  • Furthermore the cypher is longer than the single word. So if you say blue twice in the same sentence, it will have 2 different outcomes, since it is encrypted with different parts of the cypher. – Zsolt Szilagy Nov 24 '16 at 11:07
30

An encrypted data stream is statistically indistinguishable from a purely random stream, at least when it is encrypted by a good encryption algorithm. Since the encrypted stream cannot be distinguished from a random stream, there is no structure on which a universal translator could work. If the encrypted data stream can be distinguished from a random stream, then the encryption algorithm is broken; we should expect that by the time we are smart enough to build a universal translator, we will also be bright enough to write decent encryption algorithms.
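
One rough way to see the "no structure" claim is to compare byte-level Shannon entropy before and after encryption. The sketch below is only an illustration under stated assumptions: XOR with a fresh random pad stands in for "a good encryption algorithm", the sample text is made up, and byte-frequency entropy is just one crude randomness measure among many.

```python
import math
import secrets
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution (maximum is 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample text, repeated so the frequency estimate is stable.
plaintext = (
    b"a universal translator learns a language by finding "
    b"the patterns that every language has to contain "
) * 1000

# Stand-in for a good cipher (assumption): XOR with a fresh random pad.
pad = secrets.token_bytes(len(plaintext))
ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))

print(f"plaintext:  {entropy_bits_per_byte(plaintext):.2f} bits/byte")
print(f"ciphertext: {entropy_bits_per_byte(ciphertext):.2f} bits/byte")
```

The plaintext, made only of lowercase letters and spaces, stays well below 5 bits per byte, while the ciphertext sits very close to the 8-bit maximum, which is what a uniformly random stream looks like to this measure.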

AlexP
  • 3
    That significantly depends on the method of encryption used and upon the method by which the universal translator translates the language it is hearing. While there do exist Information-Theoretically Secure encryption methods as you describe, there are many, many other kinds of encryption that can be broken with enough computation power. "Enough" computation is very subjective, but if the UT has that kind of power it could possibly decrypt an Information-Theoretically Insecure message. – MozerShmozer Nov 21 '16 at 23:10
  • @MozerShmozer There's nothing about that scenario which differentiates a UT from any other powerful computer brute forcing weak or flawed encryption. – Schwern Nov 22 '16 at 01:09
  • @Schwern Which seems to be exactly the point of the question! – user253751 Nov 22 '16 at 01:16
  • 2
    The point I was making is that progress in AI (as the basis of the universal translator) will most likely be matched by progress in cryptography. We are already transitioning towards algorithms which are immune to quantum computing, although at present we don't have real/practical/powerful quantum computers (depending on your take on the current state of the art). When the universal translator will appear as a tiny speck on the horizon cryptography will have already introduced encryption algorithms immune to it. – AlexP Nov 22 '16 at 06:48
  • @MozerShmozer While you are correct, I think there are two things worth noting. First, encryption does technically describe a process that renders information indistinguishable from noise; not all "encryption" methods succeed in this regard but to information theory this is an implementation flaw rather than an aspect of encryption as a process. – anaximander Nov 22 '16 at 14:47
  • @MozerShmozer Second, many forms of encryption that can be broken today are broken by some kind of brute force, eg. encrypting possible plaintexts until one matches the ciphertext, rather than solely by finding and interpreting structure in the output. Where the ciphertext's structure is relevant, it's often relevant in determining the techniques and parameters used in the encryption so that the ciphertext can be decrypted by other means, rather than as a means to directly uncover the message. – anaximander Nov 22 '16 at 14:48
  • The encrypted data is NOT statistically indistinguishable from noise. It's based on some sort of language, which has some structure and therefore the encrypted data has structure as well and statistical analysis can be used to decrypt it. For example the letter 'e' in English makes up 12% of all usage. They used that kind of analysis during WWII to decode enigma messages. Modern encryption tries to protect against that but with sufficiently powerful computers anything can be broken. – ventsyv Nov 22 '16 at 21:40
  • 3
    @ventsyv: You are misinformed. Modern encryption algorithms cannot be defeated by brute force alone. They cannot. Computation needs energy and there are hard physical limits to how powerful a computer can be. In order to break a modern algorithm, say AES, you need to find a weakness in the algorithm or to wait for a mathematical breakthrough. Note that an encryption system is much more than an algorithm, and it is not unusual to defeat the encryption system as a whole because some part of it was weak; but this is outside the scope of the question. – AlexP Nov 22 '16 at 21:47
  • @ventsyv ...the letter 'e' in English makes up 12% of all usage. Generally true when analyzing English sentences. Unfortunately, it's worthless for any decently encrypted text. Under encryption, 'e' is effectively precisely as likely to appear as any other character regardless of the message. Such details are why decryption is very different from inferring semantic meanings. The "12%" type of detail is also useful mostly when you already know language rules of unencrypted text. – user2338816 Nov 23 '16 at 01:15
  • @ventsyv That's flat-out wrong. A one-time pad is perfect encryption. The result of encrypting with a one-time pad leaks no information at all about the original message. An encrypted message generated by a one-time pad could've come from literally any message of equal length and every single message is equally likely. So you can't do better than brute force, and as AlexP mentions physical limits make brute force impossible unless you're willing to wait thousands of years or worse. Typical encryption uses 128-bit keys. 2^128 is an astronomically large number of guesses. – Doval Nov 23 '16 at 02:36
  • @ventsyv Your knowledge of encryption seems to be out of date. It's all about entropy (in this context, Shannon entropy), which is a measure of how unpredictable a value is. In other words, given a portion of the message, how reliably can I predict the next letter? With ideal encryption, the answer is not at all. A one-time pad message, for example, could be decrypted into any message of the same length, given the right key. If you don't know that key, you physically cannot know what the message is. Yes, frequency analysis was used in WW2. Encryption technology has moved on since then. – anaximander Nov 23 '16 at 11:31
  • @Doval You can't even brute force a message that is encrypted by a one-time pad, as you have no way to verify that you have the correct plaintext when every other plaintext is equally likely. – Paul Nov 23 '16 at 15:52
  • @Paulpro True. I only brought up the OTP to show that encryption doesn't have to leak info from the original message. I ran out of space in that comment but my intention was to talk about brute force attacks on other ciphers, where usually only one key produces a message that's not complete nonsense. – Doval Nov 23 '16 at 16:05
  • There was a story titled "Cryptic" in Asimov's or Analog magazine many years ago. A radio astronomer on Earth intercepted messages between some distant solar systems, and was using his computers to try to decipher them. Eventually he realized that the messages had no "variation" -- as if every "word" or "letter" occurred the exact same number of times as every other. He deduced that the messages were in code to prevent interception, and that the civilizations must be at war. – Shawn V. Wilson Nov 24 '16 at 21:27
13

The answer is unfortunately simple:

There are no universal translators in real life, so the capabilities of a fictitious universal translator in a story are 100% defined by the author's choice of such capabilities.

Thus the answer to this is plain and simply "whatever you want it to be." That being said, there would be some requirements on a UT that can decrypt encrypted text.

  • It needs to be effective at reading between the lines to find the true meaning. What you are describing would only make sense if the translator tries to find the real meaning of things, not just what is said. A translator that can unmangle enigma encoded messages would be able to read between the lines for a hapless male character who is obliviously missing the hints his wife is giving him. (come to think of it... can you make me one of these!?)
  • The mere fact that it is possible to translate such meanings should suggest that there is some degree of commonality between all languages. In linguistics, it is not known whether such a commonality exists. Chomsky famously suggested that the concept of recursion was universal to all human languages -- a so called "universal grammar." Strange human languages such as the Pirahã language make it difficult to argue this commonality exists in humans. Commonality between sapient species is beyond that, and in the realm of the author.
Cort Ammon
  • 3
    -1 If it's "magic", sure, but in that case the OP wouldn't be asking the question. But assuming even the most basic laws of Information Theory hold, languages have patterns whereas encryption is trying to look as much like random noise as possible. See @AlexP's answer. – Schwern Nov 22 '16 at 01:12
  • 1
    @Schwern And yet, literally by definition, encryption contains patterns. Otherwise it'd be pretty boring to receive an encrypted message! Hence why I focused on other attributes that would have to go hand in hand with such an ability. – Cort Ammon Nov 22 '16 at 01:16
  • 3
    No, by definition strong encryption contains no hints at the original message. For example, if I use a well-generated one-time-pad properly the simple message HELLO can become literally anything. Using the key XMCKL it becomes EQNVZ. It is impossible to break that encryption without already knowing something about the message or the pad because the message I sent could be any possible 5 letter string. It's only the key that gives it meaning. Any hint of the original message in an encrypted message is a flaw in the encryption algorithm. – Schwern Nov 22 '16 at 01:25
  • 1
    @Schwern I suppose one time pads are the unique case, because their particular strength happens to be what they ask for here (although it does leak information, such as message length... OTPs aren't perfect for everything). However, given a message that is longer than the key (rejecting OTPs), there IS information in the data, even if it may seem random. The fact that encryption algorithms are broken regularly shows the information is there. – Cort Ammon Nov 22 '16 at 02:24
  • 2
    OTP, properly used with padding, only leaks max message length, but I used them as a simple demonstration that good encryption means random messages. Good encryption algorithms aren't broken in one go, and it's not done by looking at the messages for patterns. Instead, the algorithm is examined for flaws which can be exploited to reduce the scale of a brute force attack. Without knowing the algorithm, having the message is useless. Understanding a language isn't about finding minute patterns in large amounts of noise anyway, it's about seeing patterns and interpreting them, more like decoding. – Schwern Nov 22 '16 at 03:44
  • @Schwern Modern encryption algorithms are publicly known, and still are safe from decryption without the key. – Borsunho Nov 22 '16 at 12:48
  • @Schwern I think the difference between decryption and language is that, in the case of language, the other person intends you to learn something about the language, while in decryption the desire is explicitly to minimize communication to those who lack the key. However, when you're talking about handwavium along the lines of a universal translator, the line between those may get more blurry. – Cort Ammon Nov 22 '16 at 15:54
  • 3
    @Schwern AlexP’s answer notwithstanding, a real Universal Translator would presumably break mathematical laws in similar ways to what’s needed to break strong encryption. I say “presumably” because this may depend on things about languages that we do not yet know. So this answer is spot-on; a UT is a mathematical impossibility so positing it means anything goes: in formal terms, you can infer anything from a false premise. – Konrad Rudolph Nov 22 '16 at 16:02
  • That said, the answer incorrectly cites the Pirahã language as a counter-example to universal grammar. This is a common misunderstanding (nor is it a counter-example to the universality of recursion or numerous other commonalities). – Konrad Rudolph Nov 22 '16 at 16:05
  • @KonradRudolph Is my statement actually incorrect? I worded it carefully as "Strange human languages such as the Pirahã language make it difficult to argue this commonality exists in humans." I thought that was a safe construction which captured how much unsettling this language caused, without actually claiming that it invalidates any UG related claims. If that too is incorrect, then I suppose I'm the one who gets to learn today =) – Cort Ammon Nov 22 '16 at 16:09
  • @CortAmmon The majority of linguists would argue strongly that the Pirahã language hasn’t caused unsettlement (and that Everett’s claims are either flat out invalid or a misunderstanding/misrepresentation of Chomsky’s writing, see this quote: http://pastebin.com/raw/F9k7GKD8). That said, a vocal minority maintain the opposite. At any rate, I don’t think this is that relevant to the question at hand, and the Pirahã language would certainly provide a challenge for any real Universal Translator, regardless of the validity of UG. – Konrad Rudolph Nov 22 '16 at 16:44
11

One important aspect that all naturally occurring (and some invented) languages have in common is that they have native speakers who learned the language by listening to it. I won't go so far as to say that the languages were "designed" to be learnable, but any language that isn't human-learnable wouldn't last very long.

Encryption, on the other hand, is specifically designed so that a listener who doesn't know the encryption will be unable to decipher it.

A translator that is designed to learn like a human (no matter how much accelerated) won't be able to crack encryption, just by virtue of this difference in "design" between language and encryption. The exception is a translator that has the ability to crack encryption specifically included, and even then only when the encryption in question is crackable by one or more of its algorithms.

EDIT: This doesn't even go into the technical side of why encryption is difficult to crack, which is a much bigger question (and mostly irrelevant once you consider the above).

Trevortni
  • I would say that languages are intended to be easily understood, so your initial point is wholly valid. – Tony Ennis Nov 23 '16 at 00:05
  • I was trying to avoid the question of intent for something that evolved into its current form instead of being designed a certain way from the beginning. – Trevortni Nov 23 '16 at 00:18
  • 1
    But if I weren't trying to avoid that, I would have just said that languages are designed to be understood by anyone who cares to learn, while encryption is designed to be understood only by those who are entrusted with the knowledge needed to crack it. – Trevortni Nov 23 '16 at 00:19
5

No.

This extends on some other answers going into more details, but the short summary is that every language by necessity has patterns. A grammar, a fixed vocabulary, semantics, all of that. A well-encrypted data stream should not have any meaningful patterns, it is indistinguishable from noise or randomness.

A universal translator by definition does not have a database of languages (it wouldn't be universal, or the database would have to be infinite), but rather "learns" a new language it encounters. The only way to do that is to somehow (handwaving) decipher the patterns in the new language. On an encrypted data stream, it would not have any patterns to identify, thus it cannot translate/decrypt it.


Even shorter answer:

No, translation and decryption are completely different processes with different rules and methods, and a system capable of one isn't automatically capable of the other.


Finally, an answer with a caveat: when you go into the realm of very, very simple encryption (Caesar ciphers and other substitutions), then yes, your translator would be able to translate encrypted messages, because these primitive ciphers don't hide the patterns. That is exactly why they are so trivially easy to break that you can do it with a pen and a sheet of paper.
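
As a rough illustration of how little such a cipher hides, the following Python sketch recovers a Caesar shift purely from letter frequencies. It is a sketch under assumptions: the frequency table holds approximate, commonly quoted percentages for English, the sample sentence is made up, and short or unusual texts can fool this kind of scoring.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent (rounded; an assumption).
ENGLISH_FREQ = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.77, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.095, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 0.98, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.074,
}


def caesar(text: str, shift: int) -> str:
    """Rotate lowercase letters by `shift`; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def chi_squared(text: str) -> float:
    """Distance between the text's letter counts and typical English."""
    letters = [ch for ch in text if ch in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = total * percent / 100
        score += (counts.get(letter, 0) - expected) ** 2 / expected
    return score


def crack_caesar(ciphertext: str) -> tuple[int, str]:
    """Try all 26 shifts and keep the one that looks most like English."""
    best = min(range(26), key=lambda s: chi_squared(caesar(ciphertext, -s)))
    return best, caesar(ciphertext, -best)


SAMPLE = (
    "a well encrypted data stream should not have any meaningful patterns "
    "but a simple substitution cipher keeps every pattern of the original "
    "language intact"
)
secret = caesar(SAMPLE, 11)
shift, recovered = crack_caesar(secret)
```

The search space is only 26 keys, and the letter statistics of the plaintext survive the encryption unchanged apart from being relabelled, which is the pattern a pattern-finding translator could latch onto.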

Tom
4

One thing that wasn't mentioned yet is that a universal translator would be perfect for codewords and code phrases.

A codeword or phrase is basically a substitution for a normal word or phrase, for example a "tank" could be called a "can" and killing someone could be "taking out the garbage".

A universal translator could pick that up as a local dialect, synonyms or sayings and then automatically start translating them not even knowing it was meant as a code.

Daan Bakker
  • I would expect a translator to translate literally, unless the dialects were included in the programming. – Sensii Miller Nov 22 '16 at 17:05
  • 1
    I can understand your standing. I think of the line in Ocean's Eleven where the one guy was talking about being in a Barney. He had to explain, Barney = Barney Rubble = Rubble = Trouble. I doubt anyone outside his flat understood that. – Sensii Miller Nov 22 '16 at 17:07
  • 1
    @SensiiMiller, but if Barney was used consistently and always referring to the same scenario, then the UT would be able to infer its meaning from context. – Arturo Torres Sánchez Nov 23 '16 at 04:35
  • I don't think it could deduce the meaning without direct input defining it. Being a universal translator, it accepts input and provides output based on programming. It is a machine so it does not understand meanings of feeling words, like "trouble", anymore than a language to language translator has an understanding of feeling words. I guess it is possible to program it to understand the types of input that define feeling words, but that could allow it to be corrupted by faulty data input or an antagonist who deliberately provides data input that corrupts its ability to learn....interesting. – Sensii Miller Nov 23 '16 at 19:50
4

Universal Translator as a cryptanalysis engine

The proposed concept isn't that far fetched. It's worth noting that the field of machine translation starts with this quote:

“When I look at an article in Russian, I say: ’This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode’” (Weaver, 1955)

Any non-magic implementation of a Universal Translator must be a de facto cryptanalysis engine that attempts to divine a decoding process for a previously unknown communications system, given some sample data.

It doesn't mean that it can magically start solving unsolvable problems. Secure modern encryption algorithms are not vulnerable to analysis that a non-magical UT can perform. The following styles of encryption should be in scope of what any UT must be able to do to be UT:

  • All the various styles of substitution ciphers - they map directly to translation problems between e.g. text in different alphabets;
  • Many approaches of steganography - detecting which parts of observed behaviour are relevant for the language (e.g. is meaning communicated by the pitch shifts in sounds coming from the vocal tract, the twitching of the eyes, or both?) is comparable to the problem of detecting hidden messages in large data streams;
  • Repeatable block cipher analysis - mapping fragments of noise to units of meaning, given vague data and guesses about what it might represent, is comparable to breaking ciphers used in Electronic Codebook (ECB) mode (https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Electronic_Codebook_.28ECB.29).

However, language (no matter how alien) is substantially different from an encryption system in that it is meant to be understandable and learnable, either explicitly designed so or naturally evolved towards that. Properly designed and used encryption systems do not leak information that can be used to learn them, so a non-magic universal translator doesn't have available data to learn from.
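
The ECB item in the list above can be sketched in a few lines of Python. This is a toy, not a real cipher: the block cipher here is a hypothetical 4-round Feistel network built on SHA-256, chosen only so the example runs with the standard library, and the key and message are made up. What it shows is the codebook property itself: equal plaintext blocks always produce equal ciphertext blocks.

```python
import hashlib

BLOCK_SIZE = 8  # bytes per block in this toy cipher
ROUNDS = 4


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round_function(key: bytes, round_no: int, half: bytes) -> bytes:
    """Keyed mixing function for one Feistel round (toy construction)."""
    digest = hashlib.sha256(key + bytes([round_no]) + half).digest()
    return digest[: len(half)]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round_function(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round_function(key, i, left)), left
    return left + right


def ecb_encrypt(key: bytes, data: bytes) -> list[bytes]:
    """Encrypt each block independently, which is what ECB mode means."""
    if len(data) % BLOCK_SIZE:
        raise ValueError("data length must be a multiple of BLOCK_SIZE")
    return [
        encrypt_block(key, data[i : i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    ]


key = b"hypothetical key"
blocks = ecb_encrypt(key, b"ATTACK!!RETREAT!ATTACK!!")

# The first and third blocks are identical: the repetition in the message
# survives encryption even though each block on its own looks like noise.
repeated = blocks[0] == blocks[2]
```

An eavesdropper who sees the same ciphertext block before every attack can learn what it "means" without ever recovering the key, much like learning a word from context. Modes that mix in a fresh nonce or the previous block remove that handle.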

Peteris
3

This sounds like a sword and armour problem.
A sword that can cut through any armour forces the design of an armour that can't be cut through by any existing sword, which forces the invention of a sword that CAN cut through that armour, which forces the design of better armour,....and so on.

Similarly, a translator that can decrypt encrypted information forces the design of better encryption algorithms. Now, unless the translator is specifically designed as a snooping tool, that's where it stops, otherwise you get an encryption arms race.

The answer, for a legitimate device, would be NO.

nzaman
3

I am the one that wrote the encryption concept in the referenced link.

My intent behind the idea was that a language could EVOLVE with a natural encryption, not necessarily a digital or binary type encryption. While I was considering the idea, I was thinking of a reason why encryption would be required... and I imagined a society that only communicated via sonar or pheromones in a crowded space. In order to separate conversations from the noise of other conversations, a natural encryption system was developed. The parties involved in the conversation had the key to decode, and everything else that was indecipherable was just considered background noise.

To Answer the Question:

Yes

I believe that it would be well within the ability of a Universal Translator to decrypt KNOWN encryptions, especially ones that were created dozens/hundreds of years ago.

This knowledge base of encryption techniques and strategies could just as easily be applied to new variations, and would require much of the same logic, analysis and intuitive leaps as parsing a newly discovered language. Many of the same problems would apply too... for example, not having enough data (short message bursts), limited reference points, design of encryption too unique (grammar), limited ability to extract meaning, etc.

Ultimately the UT is an AI computer and would not give up (humans might stop the processors at some point though). Eventually given enough data and processing power the UT could crack the encryption.

However, I wouldn't necessarily make such encryption breaking UTs available to the general public... it would be in the hands of government, military, etc.

Often in history, encrypted communiqués are decoded but their information may no longer be viable or valuable past a certain period of time (such as in military operations).

Do not expect that your currently encrypted data will be safe in 10+ years unless you continually update it with the latest technology.

Phil M
  • 1,471
  • 7
  • 7
2

IT-guy here

Depends on the encryption-scheme in question. For keyless encryption-algorithms it's pretty trivial, since every word can be directly translated into an encrypted word and vice versa. As an example one could name ROT13.
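To make the "simple mapping" point concrete, here is a minimal Python sketch of ROT13 using the `rot_13` codec from the standard library. The helper name `rot13` is just something I made up for illustration.

```python
import codecs


def rot13(text):
    """Apply ROT13: rotate each ASCII letter by 13 places, leave the rest alone."""
    return codecs.encode(text, "rot_13")


message = "Hello, World!"
scrambled = rot13(message)

# There is no key: applying the same mapping a second time restores the text.
print(scrambled)
print(rot13(scrambled))
```

Since there is nothing secret involved, a UT that has seen the mapping once can treat it like any other fixed alphabet.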

For encryption with keys, things become hard or even impossible. Consider for example the Caesar cipher, which is basically ROT13 with an arbitrary rotation offset. The issue here is obviously that we need the key to get a proper result. We can of course guess the key, but this requires a way of distinguishing the correct output from the rubbish that gets produced when we translate with a wrong key. Such a way may exist for certain inputs, but there is also a fair chance that another key produces a different text that is equally valid.
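A rough sketch of that guessing game in Python, under loud assumptions: the tiny `KNOWN_WORDS` set stands in for whatever the UT uses to recognise "valid" text, and all names here are hypothetical. The English words "cheer" and "jolly" happen to be Caesar shifts of each other, which shows the ambiguity for short messages.

```python
import string

ALPHABET = string.ascii_lowercase

# Hypothetical stand-in for the UT's knowledge of what valid text looks like.
KNOWN_WORDS = {"cheer", "jolly", "the", "enemy", "will", "attack", "at", "dawn"}


def caesar_shift(text, key):
    """Shift every lowercase letter by `key` places (negative keys shift back)."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def plausible_keys(ciphertext):
    """Try all 26 shifts and keep those whose output consists only of known words."""
    hits = []
    for key in range(26):
        candidate = caesar_shift(ciphertext, -key)
        words = candidate.split()
        if words and all(word in KNOWN_WORDS for word in words):
            hits.append((key, candidate))
    return hits


# A one-word message is ambiguous: two different keys give valid words.
print(plausible_keys(caesar_shift("cheer", 3)))

# A longer message leaves only one plausible key.
print(plausible_keys(caesar_shift("the enemy will attack at dawn", 3)))
```

The longer the message, the fewer keys survive the check, which is why a cipher with only 26 keys falls so easily once there is enough text.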

For modern encryption algorithms this becomes even worse. The Caesar cipher has a key between 0 and 25 (26 possible shifts). AES-256 encrypts with a 256-bit key, so on average we need 2 ^ 256 / 2 = 2 ^ 255 guesses before we find the correct key, with a fair chance of finding a few other keys that produce other "correct" output as well (correct in the sense that the UT understands it). On top of that, the computation would take forever on current supercomputers, and the energy produced worldwide in a year would only be enough to test a fraction of the keys. And at that point we haven't even tested whether any output is valid.
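For a feel of the numbers, a back-of-the-envelope sketch in Python. The guessing rate of 10^18 keys per second is purely an assumption I picked for illustration, not a measurement of any real machine.

```python
CAESAR_KEYSPACE = 26          # distinct shifts of a 26-letter alphabet
AES256_KEYSPACE = 2 ** 256    # distinct 256-bit keys

# Assumed, hypothetical attacker speed: 10^18 key trials per second.
GUESSES_PER_SECOND = 10 ** 18
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_guesses(keyspace):
    """On average the right key turns up after searching half the key space."""
    return keyspace // 2


def years_to_search(keyspace, rate=GUESSES_PER_SECOND):
    """Expected brute-force time in years at `rate` guesses per second."""
    return average_guesses(keyspace) / rate / SECONDS_PER_YEAR


print(average_guesses(CAESAR_KEYSPACE))
print(average_guesses(AES256_KEYSPACE) == 2 ** 255)
print(f"{years_to_search(AES256_KEYSPACE):.2e} years")
```

Even with that generous rate, the expected search time is many orders of magnitude longer than the age of the universe.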

So in short:
For keyless encryption schemes: sure, it's just a simple mapping. For encryption using a key: theoretically yes, but the encryption needs to be extremely weak and you need to be lucky with the encrypted text to get a definite output. For any stronger encryption you need an enormously powerful UT (far beyond what is possible with state-of-the-art resources). Additionally, encryption technology moves on as well. E.g. there are already public-key encryption algorithms in the making that are resistant to quantum computers. You'll never be able to build a UT that can translate ciphertext produced with state-of-the-art encryption, simply because preventing that is exactly what encryption is supposed to do.

  • In simplest terms, a UT translates something representing meaning in one form, to something representing equivalent meaning in another form. For keyed encryption, the meaning is essentially split between the key, the ciphertext (and one other thing). Imagination might present a computer that could try all possible keys at once, but the trick (as you note) is identifying when the correct plaintext has been produced. So that "one other thing" is knowledge of which of all possible keys is the correct one. That (with public/private key encryption) is only known to the holder of the private key – jinglesthula Nov 22 '16 at 20:37
  • Hm, well - and two other things. Knowledge of the encryption scheme used, and the decryption algorithm for that scheme. The first may be unknown (and is the most serious deal-killer for the UT). Knowing the first we could assume knowledge of the second. – jinglesthula Nov 22 '16 at 20:40
  • @jinglesthula well, one essential problem is exactly that meaningfulness. It's just like with natural language. The meaning might change entirely based on what language/algorithm and key is used. There are quite a few examples of this in natural language (quite often referred to as "false friends"). For binary codes this gets even worse. For example 0x6869210A could mean "hi!" followed by a newline in ASCII, or represent a number. Or perhaps it's a magic number identifying a certain type of file, ... Encryption algorithms that aren't reversible are called hashes. But that point is rather about math than knowledge. –  Nov 22 '16 at 21:01
2

No.

An encrypted message is, by definition, incomplete.

When you encrypt a message, you then have two pieces that you need to unlock it:

  • The key
  • The encrypted message

The Universal Translator wouldn't have the key.

Note that the key can be arbitrarily long (e.g. a book cipher), and if you come up with the wrong key, you could decrypt the message to a completely different result, failing the translation.

G_Gus
  • 21
  • 2
2

It depends on how the translator is implemented, but probably No.

There are two major categories of translator: language analysis (Star Trek), and mind reading (Hitchhiker's Guide to the Galaxy).

Language Analysis

In the case of the former, there always appear to be limits, and certain languages are just so alien that they cannot be translated. Examples: Tamarian in Star Trek and the Orz in Star Control. I think that, for the reasons stated, if the engine is making a best-guess effort to analyze the data based on common thought patterns, then the same limitations that apply to species that fundamentally reason differently would clearly extend to cryptography, as good cryptography means that no analysis can really be done.

Mind Reading

If the translation works by directly accessing the mind of the creature speaking, then it could work assuming that the device knows how to directly read the thought patterns of the one speaking. An obvious example of where it would definitely work is if humans had an implant (in their brain or vocal cords) that would encrypt the speech as it's being spoken, and a second implant that would do the translation probably accessing a public key over wifi or some such.

On the flip side, this sort of translation would be extremely unlikely to work (at all) when communicating with sentient robots, whether or not the speech was encrypted.


Of course, a translator could be implemented that attempts to use both approaches simultaneously, but clearly it would not work in the case of sentient robots speaking under encryption.

durron597
  • 281
  • 1
  • 8
  • What about the Farscape translators? – Sensii Miller Nov 23 '16 at 22:05
  • @SensiiMiller haven't watched Farscape. Could you elaborate? – durron597 Nov 23 '16 at 22:20
  • It has been a while. My memories of it are a bit fuzzy. Here is the Wikipedia page: https://en.wikipedia.org/wiki/Farscape . An astronaut is trying to reach light speed travel by skipping off the atmosphere. As he almost reaches it, a wormhole appears and swallows his ship then disappears. He ends up somewhere else, travelling a great distance through the wormhole. There are sentient beings there. Obviously, they do not speak English, or any other Earth language. He gets captured(rescued) by some people running from the stellar law enforcers. A little bug-like machine injects him – Sensii Miller Nov 23 '16 at 22:25
  • with something (a UT) that allows him to understand the beings on the ship and they understand him because they have the same UT injected in them. I forget the explanation of how they work. You should look for it online. It was on Netflix for a long time. I highly recommend it if you are a fan of the two you mentioned. – Sensii Miller Nov 23 '16 at 22:35
2

Yes, provided a large enough sample.

To learn a language, a universal translator needs a large corpus of texts in that language, ideally with some information about their meaning. From that corpus it can infer information about the language.

A large sample of encrypted texts with the same key won't be different. It is just a language with different rules - maybe a bit more complicated - but those rules can be inferred from the sample. In fact, it would be possible for cryptographers to break any code given enough messages with the same key and enough computing power.

Pere
  • 4,151
  • 12
  • 20
1

In order to translate nuance and subtext and handle contextual meaning, a universal translator would have to be an AI capable of actually understanding the language, not just an algorithm. Such an AI would be able to break the simplest encryption (substitution, ciphers, etc.) relatively easily, but more advanced encryption would be just as hard for it as it would be for humans.

If you want really hard science, ideally both sides would need their own UT, and the two would communicate with each other at high speed to learn each language. Simply not having one on each side would make it much harder and require cooperation.

John
  • 80,982
  • 15
  • 123
  • 276
  • 3
    -1 An encrypted message is (nearly) random noise. Training an AI to understand language patterns, nuance, subtext, contexts... does nothing to prepare it to break encryption, any more than a degree in linguistics makes a person a good modern cryptographer. All you've got at that point is a powerful computer, the AI is irrelevant. – Schwern Nov 22 '16 at 01:16
  • hence the term simplest, things like replacement, code phrases, rotation, or filler. The question specifically asks about primitive encryption as well. – John Nov 22 '16 at 01:26
  • not to mention using an obscure language has been used for military encryption more than once. just look at the US code talkers. – John Nov 22 '16 at 01:33
  • I see now the OP did ask about flawed encryption like Enigma. Shame, makes the question so much less interesting. Apologies for the down vote, but the basic argument still holds: your AI is nothing more than a fast computer. As for "code talkers" that's not encryption, that's just a code. I did an answer about that over in History.SE. – Schwern Nov 22 '16 at 01:35
  • I think we are falling into a language trap, encryption means different things depending on context, in history a cipher is a form of encryption, for computers it means something else entirely. . – John Nov 22 '16 at 01:47
  • 1
    there I think that edit should help clarify. – John Nov 22 '16 at 02:04
  • That last paragraph reminds me of Colossus: The Forbin Project where US and Soviet AIs start communicating with each other in an increasingly rapid and impenetrable language. – Schwern Nov 22 '16 at 03:53
  • The communications of the Navajo code talkers were encrypted AND in Navajo. That's why they were called code talkers.

    Bletchley Park had Japanese speakers, German speakers, Italian speakers who could translate the Enigma messages once they were decoded.

    Even if the Germans managed to decrypt the US communications, they didn't speak Navajo - so it provided an extra layer of secrecy.

    – Yvonne Aburrow Nov 24 '16 at 11:45
1

No.

Given an encryption algorithm A and a secret key K, we can map from a string in English to a new string. We'll call this "language" AK. The product of encryption with algorithm A using key K is a string in AK.

Dog -> [cyphertext]

Assuming our universal translator is capable of exhaustively trying every possible language in existence, it should eventually try the language AK. This will result in:

[cyphertext] -> Dog

However, it's also going to try every other language in existence. There likely exist other keys that result in:

[cyphertext] -> Cat

[cyphertext] -> [a secret so terrible, the universe is instantaneously destroyed]

[cyphertext] -> [the recipe for Coke]

[cyphertext] -> [cyphertext]

It'd be possible to reject spurious combinations of A and K with a large enough sample size (we get garbage output that isn't English), but that assumes that we know the language we're decrypting to.

Since we're testing every language that could ever exist, we can handle double, triple, etc. encryption - just create a language AK•AK that represents two rounds of encryption, then translate from AK•AK to English. So that shouldn't be an issue.

There are an infinite number of languages, and it's not even possible to recognize all of them with a Turing machine, but we're already assuming the universal translator has gotten past that little computability problem.

So, if you have a universal translator that can do uncountably infinite quantities of work at once and is capable of figuring out what a valid string in the target language is, then yes, you can decrypt anything.

..so basically, the answer is gonna' be no.

etherealflux
  • 111
  • 3
1

Simply put, the answer goes like this:

Encryption is about taking a number of data segments (not necessarily words, mind you, but we can assume words), scrambling the data (e.g. mixing all the letters together to form one giant word or series of words), applying an invertible numerical transformation to the data (e.g. adding 1 to every letter, i.e. shifting it), mixing some more, doing more transformations, etc. Finally the data is sent off and, at the other end, the process is undone. DNA is a good example of encryption. Programs are encryption. Integrals (if you know calculus) are encryptions (because the inverse to a specific integral is always unambiguous). If your translator handles general encryption, this implies your translator can do advanced mathematics, decompile programs and even understand DNA sequences (note: we don't even understand all of DNA). Also, consider that not all programs are written in the same language. Compilers turn specific languages into specific binary. The results can be the same, but they usually have different styles based around the features they use under the hood. Trust me, decompiling would not be trivial.

Now, take your universal translator. Let's assume it can hear and parse a sentence and translate it to your language. A general idea behind languages is that all of them have words and that words are pretty much atomic. It wouldn't make sense to have a language where you cycle between every word and say one syllable in a loop.

In that world the sentence: "An ex-am-ple sen-tence" would be said as: "an ex sen am tence ple". Needless to say, it would be gibberish. Words are atomic. They don't get mixed. Encryption is all about the sub-atomic manipulation. Break the words apart. Mix up the letters. Don't mix the meanings of the symbols of the words. Manipulate the expression of the words, which is where your translator would inevitably fail without a serious upgrade beyond language skills.
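For what it's worth, that syllable-cycling "language" is easy to write down as a procedure, which is part of why it feels more like a cipher than a language. A small Python sketch, assuming syllables are marked with hyphens as in the example (the function name is made up):

```python
from itertools import zip_longest


def interleave_syllables(sentence):
    """Cycle through the words, saying one syllable of each per pass."""
    # Each word becomes a list of its hyphen-separated syllables.
    words = [word.split("-") for word in sentence.lower().split()]
    # zip_longest groups the first syllables, then the second syllables, etc.,
    # padding words that have run out of syllables with None.
    passes = zip_longest(*words)
    return " ".join(syl for one_pass in passes for syl in one_pass if syl is not None)


print(interleave_syllables("An ex-am-ple sen-tence"))
```

The word boundaries are gone in the output, so anything that works word by word has nothing to hold on to.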

Even if your universal translator needs no context, solving encryption is a much different problem than translating a language. We can argue that two languages both have a way of expressing "blue". However, encryption distorts this. The phrase "a blue bird" is no longer separable. You cannot look at just "blue". Do this on a large enough scale and you propose a "language" where every sentence is a particular symbol. At that point, expect processors to burn up. Plus, it is expected that languages be spoken in some way. Therefore, the translator learns from hearing, not reading. Anything not audio-visual is nonsense to it. Converting the written language can be handled differently, and we can assume certain rules. If it fails... well that's just a limitation of the theory of languages and what we expect a language to be, not the device.

to sum it all up:

Unless you just desire a universal computer calculus solver/DNA parser/assembly code decompiler/language translator... there is nothing that says it must be this.

In fact, go over to mathematics and ask about mappings from one set to another, AKA functions. Your universal mapper (rather than translator) goes from merely being capable of determining the one-to-one mapping of thoughts and ideas (aka words) to being capable of determining the one-to-one mapping of any data set to any data set. In other words, I feed your thing a sampling of a function at different points... it gives me back an exact expression of that function. Good luck dealing with unclosed forms.

user64742
  • 1,493
  • 9
  • 12
1

Universal Translators as they are depicted in fiction are not something we can explain with science.* To explain them we need pseudo-science or magic.

Pseudo-science and magic are both capable of breaking any encryption the author wants them to break.


*There may be a few rare stories involving somewhat realistic translators where actual in-depth study of a new language and associated culture is required before the translator can partially translate. However, the common form of universal translators used in fiction only needs to pick up a few sentences of a language before they can translate the entire works of Shakespeare, which is magic.

Peter
  • 4,471
  • 1
  • 12
  • 18
0

Depends on the encryption, depends on the capabilities the author has chosen to give the translation device.

Encryption covers a broad range of approaches. From a purely textual standpoint, simple forms of encryption like substitution ciphers will not be any more difficult to translate than an unknown but not yet translated text. But modern encryption that outputs ciphertext that is seemingly indistinguishable from random noise will probably be on the more difficult end of things to crack.
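One way to see the difference between the two ends of that range is the index of coincidence, a classic cryptanalysis statistic: the chance that two letters picked at random from a text are the same. The Python sketch below is only an illustration. The sample sentence and the perfectly flat letter distribution (a stand-in for well-mixed ciphertext) are my own assumptions.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def index_of_coincidence(text):
    """Probability that two letters drawn at random from `text` are equal."""
    letters = [ch for ch in text.lower() if ch in ALPHABET]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def caesar(text, key):
    """Simple substitution: shift each lowercase letter by `key` places."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text
    )


plain = "the enemy will attack at dawn tomorrow and retreat at midday today"
substituted = caesar(plain, 7)
flat = ALPHABET * 10  # every letter equally frequent, like well-mixed output

# A substitution only relabels the letters, so the statistic is unchanged.
print(index_of_coincidence(plain), index_of_coincidence(substituted))
# A flat distribution scores noticeably lower and offers no such foothold.
print(index_of_coincidence(flat))
```

The substituted text keeps the lopsided letter statistics of the original, which is the kind of regularity a translator could latch on to. Output that looks like noise does not.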

How generalized and clever is this translator? Does it translate between known languages? Can it take a previously unknown language and derive meaning from whatever is being communicated? Can it translate concepts that don't exist in one language from another? Can it detect things like irony, body language and other contextual cues to meaning? What if the contextual cues are completely different between species? Does the translation device understand meaning derived from cultural context? How about meaning derived from personal experiences? A machine smart/magic enough to quickly solve such problems without human assistance might be smart enough to do anything.

Hypothetical example: A human ship encounters an intelligent aquatic species that communicates through bioluminescence and tentacle gestures. The meaning varies depending upon the salinity and temperature of the surrounding water and the pheromones being emitted by the participants in the conversation. The alien ship is transmitting a video/audio/olfactory signal to the human ship. It is a packetized protocol of alien design, communicated over 25 frequencies in the terahertz range. Their number system is base 11 and they can't see in our visual spectrum. How long does it take the universal translator to figure out what is being sent, figure out what it means and then communicate the captain's greetings? Bonus challenge: the aliens wish to serve man.

Jim W
  • 211
  • 1
  • 4
0

Nope.

A lot of details are missing, but I think it's safe to assume your "Universal Translator" learns meaning from sample texts. With most encryption, every piece of text uses a 'key', which makes it indistinguishable from random gibberish without the original key.

A Caesar cipher or similar means of encryption simply doesn't provide enough randomness to give your Universal Translator any difficulty though, so your Universal Translator would defeat it as a means of securing information. An Enigma machine's ciphering method, however, does have some means of protection which supply a sufficient amount of entropy to protect against a Universal Translator.

Landon Powell
  • 371
  • 1
  • 2
  • 3
0

No

Given this ciphertext

9686961375462206147714092225435588290575999112457431987469512093
0816298225145708356931476622883989628013391990551829945157815154

Each of the following lines is an equally valid decryption using a one-time pad, a technique invented in 1882, a hundred and thirty-four years ago, long before the first digital computer was built.

The enemy will attack at dawn tomorrow.
The enemy will retreat at midday today.
Coffee, Bread, Sugar, Card for Penelope 
Thanks for your kind words about my mum
Take back those vile words about my mum
I will now give you your weight in gold
Leave this planet today or die in agony

They are just as valid as this decryption using RSA and a specific key

The Magic Words are Squeamish Ossifrage

Your Universal translator would have no way to pick one sentence from among the vast number of possible grammatical meaningful sentences.
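A small Python sketch of why that is so for a one-time pad. Note the assumptions: this uses a byte-wise XOR pad rather than the decimal ciphertext shown above, and the function names are made up for illustration. For any ciphertext and any candidate message of the same length, there is a pad that "decrypts" one into the other.

```python
import secrets


def xor_bytes(a, b):
    """XOR two equal-length byte strings together."""
    return bytes(x ^ y for x, y in zip(a, b))


def pad_for(ciphertext, wanted_plaintext):
    """Return the pad under which `ciphertext` decrypts to `wanted_plaintext`."""
    if len(ciphertext) != len(wanted_plaintext):
        raise ValueError("a one-time pad only hides messages of equal length")
    return xor_bytes(ciphertext, wanted_plaintext)


real = b"The enemy will attack at dawn tomorrow."
fake = b"The enemy will retreat at midday today."

true_pad = secrets.token_bytes(len(real))   # random pad, used only once
ciphertext = xor_bytes(real, true_pad)

# An eavesdropper cannot tell the true pad from this equally valid one.
fake_pad = pad_for(ciphertext, fake)

print(xor_bytes(ciphertext, true_pad))
print(xor_bytes(ciphertext, fake_pad))
```

Both pads are just random-looking bytes, so nothing in the ciphertext favours one reading over the other.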

RedGrittyBrick
  • 384
  • 1
  • 7